Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998))
On October 28, 2017 2:53:56 PM GMT+02:00, Marc Glissewrote: > >I am sending the new version of the patch in a separate email, to make >it >more visible, and only replying to a few points here. > >On Thu, 19 Oct 2017, Richard Biener wrote: > >> On Mon, Oct 9, 2017 at 12:55 PM, Marc Glisse >wrote: >>> On Mon, 3 Jul 2017, Richard Biener wrote: >>> On Sat, 1 Jul 2017, Marc Glisse wrote: > I wrote a quick prototype to see what the fallout would look like. > Surprisingly, it actually passes bootstrap+testsuite on ppc64el >with all > languages with no regression. Sure, it is probably not a complete > migration, there are likely a few places still converting to >ptrdiff_t > to perform a regular subtraction, but this seems to indicate that >the > work isn't as bad as using a signed type in pointer_plus_expr for > instance. The fold_binary_loc hunk looks dangerous (it'll generate MINUS_EXPR from POINTER_MINUS_EXPR in some cases I guess). The tree code needs documenting in tree.def and generic.texi. Otherwise ok(*). Thanks, Richard. (*) ok, just kidding -- or maybe not >>> >>> >>> I updated the prototype a bit. Some things I noticed: >>> >>> - the C front-end has support for address spaces that seems to imply >that >>> pointer subtraction (and division by the size) may be done in a type >larger >>> than ptrdiff_t. Currently, it generates >>> (ptrdiff_t)(((inttype)q-(inttype)p)/size) for q-p where inttype is >some type >>> potentially larger than ptrdiff_t. >> >> It uses a ptrdiff_t corresponding to the respective address space, >yes. >> That we use sizetype elsewhere unconditionally is a bug :/ >> >>> I am thus generating a POINTER_DIFF_EXPR >>> with that type, while I was originally hoping its type would always >be >>> ptrdiff_t. It complicates code and I am not sure I didn't break >address >>> spaces anyway... (expansion likely needs to do the inttype dance) >> >> I think that's fine. Ideally targets would provide a type to use for >each >> respective address space given we have targets that have sizetype >smaller >> than ptr_mode. >> >>> Are ptrdiff_t (what POINTER_DIFF_EXPR should return) and size_t >(what >>> POINTER_PLUS_EXPR takes as second argument) always the same type >>> signed/unsigned? >> >> POINTER_PLUS_EXPR takes 'sizetype', not size_t. > >Ah, yes, that's the one I meant... > >> So the answer is clearly no. And yes, that's ugly and broken and >should be fixed. >> >>> Counter-examples: m32c (when !TARGET_A16), visium, darwin, >>> powerpcspe, s390, vms... and it isn't even always the same that is >bigger >>> than the other. That's quite messy. >> >> Eh. Yeah, targets are free to choose 'sizetype' and they do so for >> efficiency. IMHO we should get rid of this "feature". >> >>> - I had to use @@ in match.pd, not for constants, but because in >GENERIC we >>> sometimes ended up with pointers where operand_equal_p said yes but >>> types_match disagreed. >>> >>> - It currently regresses 2 go tests: net/http runtime/debug >> >> Those are flaky for me and fail sometimes and sometimes not. > >Yes, I noticed that (there are 1 or 2 others in the go testsuite). > >> +@item POINTER_DIFF_EXPR >> +This node represents pointer subtraction. The two operands always >> +have pointer/reference type. The second operand is always an >unsigned >> +integer type compatible with sizetype. It returns a signed integer. >> >> the 2nd sentence looks bogusly copied. > >Oups, removed. > >> >> + /* FIXME. */ >> + if (code == POINTER_DIFF_EXPR) >> + return int_const_binop (MINUS_EXPR, >> + fold_convert (ptrdiff_type_node, >arg1), >> + fold_convert (ptrdiff_type_node, >arg2)); >> >> wide_int_to_tree (type, wi::to_widest (arg1) - wi::to_widest >(arg2)); > >Before your reply, I wrote something similar using offset_int instead >of >widest_int (and handling overflow, mostly for documentation purposes). >I >wasn't sure which one to pick, I can change to widest_int if you think >it >is better... Offset_int is better. >> +case POINTER_DIFF_EXPR: >> + { >> + if (!POINTER_TYPE_P (rhs1_type) >> + || !POINTER_TYPE_P (rhs2_type) >> + // || !useless_type_conversion_p (rhs2_type, rhs1_type) >> >> types_compatible_p (rhs1_type, rhs2_type)? > >Makes sense. > >> + // || !useless_type_conversion_p (ptrdiff_type_node, >lhs_type)) >> + || TREE_CODE (lhs_type) != INTEGER_TYPE >> + || TYPE_UNSIGNED (lhs_type)) >> + { >> + error ("type mismatch in pointer diff expression"); >> + debug_generic_stmt (lhs_type); >> + debug_generic_stmt (rhs1_type); >> + debug_generic_stmt (rhs2_type); >> + return true; >> >> there's also verify_expr which could want adjustment for newly >created >> tree kinds. > >ok. >
Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998))
I am sending the new version of the patch in a separate email, to make it more visible, and only replying to a few points here. On Thu, 19 Oct 2017, Richard Biener wrote: On Mon, Oct 9, 2017 at 12:55 PM, Marc Glissewrote: On Mon, 3 Jul 2017, Richard Biener wrote: On Sat, 1 Jul 2017, Marc Glisse wrote: I wrote a quick prototype to see what the fallout would look like. Surprisingly, it actually passes bootstrap+testsuite on ppc64el with all languages with no regression. Sure, it is probably not a complete migration, there are likely a few places still converting to ptrdiff_t to perform a regular subtraction, but this seems to indicate that the work isn't as bad as using a signed type in pointer_plus_expr for instance. The fold_binary_loc hunk looks dangerous (it'll generate MINUS_EXPR from POINTER_MINUS_EXPR in some cases I guess). The tree code needs documenting in tree.def and generic.texi. Otherwise ok(*). Thanks, Richard. (*) ok, just kidding -- or maybe not I updated the prototype a bit. Some things I noticed: - the C front-end has support for address spaces that seems to imply that pointer subtraction (and division by the size) may be done in a type larger than ptrdiff_t. Currently, it generates (ptrdiff_t)(((inttype)q-(inttype)p)/size) for q-p where inttype is some type potentially larger than ptrdiff_t. It uses a ptrdiff_t corresponding to the respective address space, yes. That we use sizetype elsewhere unconditionally is a bug :/ I am thus generating a POINTER_DIFF_EXPR with that type, while I was originally hoping its type would always be ptrdiff_t. It complicates code and I am not sure I didn't break address spaces anyway... (expansion likely needs to do the inttype dance) I think that's fine. Ideally targets would provide a type to use for each respective address space given we have targets that have sizetype smaller than ptr_mode. Are ptrdiff_t (what POINTER_DIFF_EXPR should return) and size_t (what POINTER_PLUS_EXPR takes as second argument) always the same type signed/unsigned? POINTER_PLUS_EXPR takes 'sizetype', not size_t. Ah, yes, that's the one I meant... So the answer is clearly no. And yes, that's ugly and broken and should be fixed. Counter-examples: m32c (when !TARGET_A16), visium, darwin, powerpcspe, s390, vms... and it isn't even always the same that is bigger than the other. That's quite messy. Eh. Yeah, targets are free to choose 'sizetype' and they do so for efficiency. IMHO we should get rid of this "feature". - I had to use @@ in match.pd, not for constants, but because in GENERIC we sometimes ended up with pointers where operand_equal_p said yes but types_match disagreed. - It currently regresses 2 go tests: net/http runtime/debug Those are flaky for me and fail sometimes and sometimes not. Yes, I noticed that (there are 1 or 2 others in the go testsuite). +@item POINTER_DIFF_EXPR +This node represents pointer subtraction. The two operands always +have pointer/reference type. The second operand is always an unsigned +integer type compatible with sizetype. It returns a signed integer. the 2nd sentence looks bogusly copied. Oups, removed. + /* FIXME. */ + if (code == POINTER_DIFF_EXPR) + return int_const_binop (MINUS_EXPR, + fold_convert (ptrdiff_type_node, arg1), + fold_convert (ptrdiff_type_node, arg2)); wide_int_to_tree (type, wi::to_widest (arg1) - wi::to_widest (arg2)); Before your reply, I wrote something similar using offset_int instead of widest_int (and handling overflow, mostly for documentation purposes). I wasn't sure which one to pick, I can change to widest_int if you think it is better... +case POINTER_DIFF_EXPR: + { + if (!POINTER_TYPE_P (rhs1_type) + || !POINTER_TYPE_P (rhs2_type) + // || !useless_type_conversion_p (rhs2_type, rhs1_type) types_compatible_p (rhs1_type, rhs2_type)? Makes sense. + // || !useless_type_conversion_p (ptrdiff_type_node, lhs_type)) + || TREE_CODE (lhs_type) != INTEGER_TYPE + || TYPE_UNSIGNED (lhs_type)) + { + error ("type mismatch in pointer diff expression"); + debug_generic_stmt (lhs_type); + debug_generic_stmt (rhs1_type); + debug_generic_stmt (rhs2_type); + return true; there's also verify_expr which could want adjustment for newly created tree kinds. ok. So if the precision of the result is larger than that of the pointer operands we sign-extend the result, right? So the subtraction is performed in precision of the pointer operands and then sign-extended/truncated to the result type? Which means it is _not_ a widening subtraction to get around the half-address-space issue. The tree.def documentation should reflect this semantic detail. Not sure if the RTL expansion follows it. I actually have no idea what expansion
Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998))
On Mon, Oct 9, 2017 at 12:55 PM, Marc Glissewrote: > On Mon, 3 Jul 2017, Richard Biener wrote: > >> On Sat, 1 Jul 2017, Marc Glisse wrote: >> >>> I wrote a quick prototype to see what the fallout would look like. >>> Surprisingly, it actually passes bootstrap+testsuite on ppc64el with all >>> languages with no regression. Sure, it is probably not a complete >>> migration, there are likely a few places still converting to ptrdiff_t >>> to perform a regular subtraction, but this seems to indicate that the >>> work isn't as bad as using a signed type in pointer_plus_expr for >>> instance. >> >> >> The fold_binary_loc hunk looks dangerous (it'll generate MINUS_EXPR >> from POINTER_MINUS_EXPR in some cases I guess). >> >> The tree code needs documenting in tree.def and generic.texi. >> >> Otherwise ok(*). >> >> Thanks, >> Richard. >> >> (*) ok, just kidding -- or maybe not > > > I updated the prototype a bit. Some things I noticed: > > - the C front-end has support for address spaces that seems to imply that > pointer subtraction (and division by the size) may be done in a type larger > than ptrdiff_t. Currently, it generates > (ptrdiff_t)(((inttype)q-(inttype)p)/size) for q-p where inttype is some type > potentially larger than ptrdiff_t. It uses a ptrdiff_t corresponding to the respective address space, yes. That we use sizetype elsewhere unconditionally is a bug :/ I am thus generating a POINTER_DIFF_EXPR > with that type, while I was originally hoping its type would always be > ptrdiff_t. It complicates code and I am not sure I didn't break address > spaces anyway... (expansion likely needs to do the inttype dance) I think that's fine. Ideally targets would provide a type to use for each respective address space given we have targets that have sizetype smaller than ptr_mode. > Are ptrdiff_t (what POINTER_DIFF_EXPR should return) and size_t (what > POINTER_PLUS_EXPR takes as second argument) always the same type > signed/unsigned? POINTER_PLUS_EXPR takes 'sizetype', not size_t. So the answer is clearly no. And yes, that's ugly and broken and should be fixed. > Counter-examples: m32c (when !TARGET_A16), visium, darwin, > powerpcspe, s390, vms... and it isn't even always the same that is bigger > than the other. That's quite messy. Eh. Yeah, targets are free to choose 'sizetype' and they do so for efficiency. IMHO we should get rid of this "feature". > - I had to use @@ in match.pd, not for constants, but because in GENERIC we > sometimes ended up with pointers where operand_equal_p said yes but > types_match disagreed. > > - It currently regresses 2 go tests: net/http runtime/debug Those are flaky for me and fail sometimes and sometimes not. +@item POINTER_DIFF_EXPR +This node represents pointer subtraction. The two operands always +have pointer/reference type. The second operand is always an unsigned +integer type compatible with sizetype. It returns a signed integer. the 2nd sentence looks bogusly copied. + /* FIXME. */ + if (code == POINTER_DIFF_EXPR) + return int_const_binop (MINUS_EXPR, + fold_convert (ptrdiff_type_node, arg1), + fold_convert (ptrdiff_type_node, arg2)); wide_int_to_tree (type, wi::to_widest (arg1) - wi::to_widest (arg2)); ? +case POINTER_DIFF_EXPR: + { + if (!POINTER_TYPE_P (rhs1_type) + || !POINTER_TYPE_P (rhs2_type) + // || !useless_type_conversion_p (rhs2_type, rhs1_type) types_compatible_p (rhs1_type, rhs2_type)? + // || !useless_type_conversion_p (ptrdiff_type_node, lhs_type)) + || TREE_CODE (lhs_type) != INTEGER_TYPE + || TYPE_UNSIGNED (lhs_type)) + { + error ("type mismatch in pointer diff expression"); + debug_generic_stmt (lhs_type); + debug_generic_stmt (rhs1_type); + debug_generic_stmt (rhs2_type); + return true; there's also verify_expr which could want adjustment for newly created tree kinds. So if the precision of the result is larger than that of the pointer operands we sign-extend the result, right? So the subtraction is performed in precision of the pointer operands and then sign-extended/truncated to the result type? Which means it is _not_ a widening subtraction to get around the half-address-space issue. The tree.def documentation should reflect this semantic detail. Not sure if the RTL expansion follows it. I think that we'd ideally have a single legal INTEGER_TYPE precision per pointer type precision and that those precisions should match... we don't have to follow the mistakes of POINTER_PLUS_EXPR. So ... above verify TYPE_PRECISION (rhs1_type) == TYPE_PRECISION (lhs_type)? Some targets have 24bit ptr_mode but no 24bit integer type which means the FE likely chooses 32bit int for the difference computation. But the middle-end should be free to create a 24bit precision type with SImode. Otherwise as said
Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998))
On Mon, 3 Jul 2017, Richard Biener wrote: On Sat, 1 Jul 2017, Marc Glisse wrote: I wrote a quick prototype to see what the fallout would look like. Surprisingly, it actually passes bootstrap+testsuite on ppc64el with all languages with no regression. Sure, it is probably not a complete migration, there are likely a few places still converting to ptrdiff_t to perform a regular subtraction, but this seems to indicate that the work isn't as bad as using a signed type in pointer_plus_expr for instance. The fold_binary_loc hunk looks dangerous (it'll generate MINUS_EXPR from POINTER_MINUS_EXPR in some cases I guess). The tree code needs documenting in tree.def and generic.texi. Otherwise ok(*). Thanks, Richard. (*) ok, just kidding -- or maybe not I updated the prototype a bit. Some things I noticed: - the C front-end has support for address spaces that seems to imply that pointer subtraction (and division by the size) may be done in a type larger than ptrdiff_t. Currently, it generates (ptrdiff_t)(((inttype)q-(inttype)p)/size) for q-p where inttype is some type potentially larger than ptrdiff_t. I am thus generating a POINTER_DIFF_EXPR with that type, while I was originally hoping its type would always be ptrdiff_t. It complicates code and I am not sure I didn't break address spaces anyway... (expansion likely needs to do the inttype dance) Are ptrdiff_t (what POINTER_DIFF_EXPR should return) and size_t (what POINTER_PLUS_EXPR takes as second argument) always the same type signed/unsigned? Counter-examples: m32c (when !TARGET_A16), visium, darwin, powerpcspe, s390, vms... and it isn't even always the same that is bigger than the other. That's quite messy. - I had to use @@ in match.pd, not for constants, but because in GENERIC we sometimes ended up with pointers where operand_equal_p said yes but types_match disagreed. - It currently regresses 2 go tests: net/http runtime/debug -- Marc GlisseIndex: gcc/c/c-fold.c === --- gcc/c/c-fold.c (revision 253536) +++ gcc/c/c-fold.c (working copy) @@ -238,20 +238,21 @@ c_fully_fold_internal (tree expr, bool i case COMPOUND_EXPR: case MODIFY_EXPR: case PREDECREMENT_EXPR: case PREINCREMENT_EXPR: case POSTDECREMENT_EXPR: case POSTINCREMENT_EXPR: case PLUS_EXPR: case MINUS_EXPR: case MULT_EXPR: case POINTER_PLUS_EXPR: +case POINTER_DIFF_EXPR: case TRUNC_DIV_EXPR: case CEIL_DIV_EXPR: case FLOOR_DIV_EXPR: case TRUNC_MOD_EXPR: case RDIV_EXPR: case EXACT_DIV_EXPR: case LSHIFT_EXPR: case RSHIFT_EXPR: case BIT_IOR_EXPR: case BIT_XOR_EXPR: Index: gcc/c/c-typeck.c === --- gcc/c/c-typeck.c (revision 253536) +++ gcc/c/c-typeck.c (working copy) @@ -3772,21 +3772,21 @@ parser_build_binary_op (location_t locat && TREE_CODE (type2) == ENUMERAL_TYPE && TYPE_MAIN_VARIANT (type1) != TYPE_MAIN_VARIANT (type2)) warning_at (location, OPT_Wenum_compare, "comparison between %qT and %qT", type1, type2); return result; } /* Return a tree for the difference of pointers OP0 and OP1. - The resulting tree has type int. */ + The resulting tree has type ptrdiff_t. */ static tree pointer_diff (location_t loc, tree op0, tree op1) { tree restype = ptrdiff_type_node; tree result, inttype; addr_space_t as0 = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (op0))); addr_space_t as1 = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (op1))); tree target_type = TREE_TYPE (TREE_TYPE (op0)); @@ -3824,23 +3824,21 @@ pointer_diff (location_t loc, tree op0, "pointer of type % used in subtraction"); if (TREE_CODE (target_type) == FUNCTION_TYPE) pedwarn (loc, OPT_Wpointer_arith, "pointer to a function used in subtraction"); /* First do the subtraction as integers; then drop through to build the divide operator. Do not do default conversions on the minus operator in case restype is a short type. */ - op0 = build_binary_op (loc, - MINUS_EXPR, convert (inttype, op0), - convert (inttype, op1), false); + op0 = build2_loc (loc, POINTER_DIFF_EXPR, inttype, op0, op1); /* This generates an error if op1 is pointer to incomplete type. */ if (!COMPLETE_OR_VOID_TYPE_P (TREE_TYPE (TREE_TYPE (orig_op1 error_at (loc, "arithmetic on pointer to an incomplete type"); op1 = c_size_in_bytes (target_type); if (pointer_to_zero_sized_aggr_p (TREE_TYPE (orig_op1))) error_at (loc, "arithmetic on pointer to an empty aggregate"); /* Divide by the size, in easiest possible way. */ Index: gcc/c-family/c-pretty-print.c === --- gcc/c-family/c-pretty-print.c (revision 253536) +++ gcc/c-family/c-pretty-print.c (working copy) @@ -1869,20 +1869,21 @@ c_pretty_printer::multiplicative_express
Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998))
On Sat, 1 Jul 2017, Marc Glisse wrote: > On Thu, 22 Jun 2017, Richard Biener wrote: > > > On Thu, 22 Jun 2017, Marc Glisse wrote: > > > > > On Thu, 22 Jun 2017, Richard Biener wrote: > > > > > > > > If we consider pointers as unsigned, with a subtraction that has a > > > > > signed > > > > > result with the constraint that overflow is undefined, we cannot model > > > > > that > > > > > optimally with just the usual signed/unsigned operations, so I am in > > > > > favor > > > > > of > > > > > POINTER_DIFF, at least in the long run (together with having a signed > > > > > second > > > > > argument for POINTER_PLUS, etc). For 64-bit platforms it might have > > > > > been > > > > > easier to declare that the upper half (3/4 ?) of the address space > > > > > doesn't > > > > > exist... > > > > > > > > I repeatedly thought of POINTER_DIFF_EXPR but adding such a basic tree > > > > code is quite a big job. > > > > > > Yes :-( > > > It is probably not realistic to introduce it just to avoid a couple > > > regressions while fixing a bug. > > > > > > > So we'd have POINTER_DIFF_EXPR take two pointer typed args and produce > > > > ptrdiff_t. What's the advantage of having this? > > > > > > It represents q-p with one statement instead of 3 (long)q-(long)p or 4 > > > (long)((ulong)q-(ulong)p). It allows us to stay in the pointer world, so > > > (q-p)>0 is equivalent to p> > what (undefined) overflow means for pointers. > > > > > > Of course it is hard to know in advance if that's significant or > > > negligible, maybe size_t finds its way in too many places anyway. > > > > As with all those experiments ... > > > > Well, if I would sell this as a consultant to somebody I'd estimate > > 3 man months for this work which realistically means you have to > > start now otherwise you won't make it this stage 1. > > I wrote a quick prototype to see what the fallout would look like. > Surprisingly, it actually passes bootstrap+testsuite on ppc64el with all > languages with no regression. Sure, it is probably not a complete > migration, there are likely a few places still converting to ptrdiff_t > to perform a regular subtraction, but this seems to indicate that the > work isn't as bad as using a signed type in pointer_plus_expr for > instance. The fold_binary_loc hunk looks dangerous (it'll generate MINUS_EXPR from POINTER_MINUS_EXPR in some cases I guess). The tree code needs documenting in tree.def and generic.texi. Otherwise ok(*). Thanks, Richard. (*) ok, just kidding -- or maybe not
Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998))
On Thu, 22 Jun 2017, Richard Biener wrote: On Thu, 22 Jun 2017, Marc Glisse wrote: On Thu, 22 Jun 2017, Richard Biener wrote: If we consider pointers as unsigned, with a subtraction that has a signed result with the constraint that overflow is undefined, we cannot model that optimally with just the usual signed/unsigned operations, so I am in favor of POINTER_DIFF, at least in the long run (together with having a signed second argument for POINTER_PLUS, etc). For 64-bit platforms it might have been easier to declare that the upper half (3/4 ?) of the address space doesn't exist... I repeatedly thought of POINTER_DIFF_EXPR but adding such a basic tree code is quite a big job. Yes :-( It is probably not realistic to introduce it just to avoid a couple regressions while fixing a bug. So we'd have POINTER_DIFF_EXPR take two pointer typed args and produce ptrdiff_t. What's the advantage of having this? It represents q-p with one statement instead of 3 (long)q-(long)p or 4 (long)((ulong)q-(ulong)p). It allows us to stay in the pointer world, so (q-p)>0 is equivalent to p
Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998))
On Thu, 22 Jun 2017, Marc Glisse wrote: > On Thu, 22 Jun 2017, Richard Biener wrote: > > > > If we consider pointers as unsigned, with a subtraction that has a signed > > > result with the constraint that overflow is undefined, we cannot model > > > that > > > optimally with just the usual signed/unsigned operations, so I am in favor > > > of > > > POINTER_DIFF, at least in the long run (together with having a signed > > > second > > > argument for POINTER_PLUS, etc). For 64-bit platforms it might have been > > > easier to declare that the upper half (3/4 ?) of the address space doesn't > > > exist... > > > > I repeatedly thought of POINTER_DIFF_EXPR but adding such a basic tree > > code is quite a big job. > > Yes :-( > It is probably not realistic to introduce it just to avoid a couple > regressions while fixing a bug. > > > So we'd have POINTER_DIFF_EXPR take two pointer typed args and produce > > ptrdiff_t. What's the advantage of having this? > > It represents q-p with one statement instead of 3 (long)q-(long)p or 4 > (long)((ulong)q-(ulong)p). It allows us to stay in the pointer world, so > (q-p)>0 is equivalent to pwhat (undefined) overflow means for pointers. > > Of course it is hard to know in advance if that's significant or > negligible, maybe size_t finds its way in too many places anyway. As with all those experiments ... Well, if I would sell this as a consultant to somebody I'd estimate 3 man months for this work which realistically means you have to start now otherwise you won't make it this stage 1. A smaller job would be to make POINTER_PLUS_EXPR take ptrdiff_t as offset operand. But the fallout is likely similar. A lame(?) half-way transition would allow for both unsigned and signed ptrdiff_t (note sizetype -> [u]ptrdiff_t is already a transition for some embedded archs eventually). Maybe allowing both signed and unsigned offsets is desirable (you of course get interesting effects when combining those in GIMPLE where signedness matters). Richard. > > And yes, I agree that POINTER_PLUS_EXPR should take > > ptrdiff_t rather than sizetype offset -- changing one without the > > other will lead to awkwardness in required patterns involving > > both like (p - q) + q. > > > > As said, it's a big job with likely all sorts of (testsuite) fallout. > > > > > > The third one is > > > >if ([b] - [c] != b - c) > > > >link_error(); > > > > where fold already during generic folding used to be able to cope with > > > > it, > > > > but now we have: > > > > (long int) (((long unsigned int) b - (long unsigned int) c) * 4) /[ex] 4 > > > > != > > > > b - c > > > > which we don't fold. > > > > > > Once we have this last expression, we have lost, we need to know that the > > > multiplication cannot overflow for this. When the size multiplications are > > > done in a signed type in the future (?), it might help. > > > > Not sure where the unsigned multiply comes from -- I guess we fold > > it inside the cast ... > > We usually do those multiplications in an unsigned type. I experimented > with changing one such place in > https://gcc.gnu.org/ml/gcc-patches/2017-05/msg01641.html , there is > probably at least another one in the middle-end. > > -- Richard BienerSUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998))
On Thu, 22 Jun 2017, Richard Biener wrote: If we consider pointers as unsigned, with a subtraction that has a signed result with the constraint that overflow is undefined, we cannot model that optimally with just the usual signed/unsigned operations, so I am in favor of POINTER_DIFF, at least in the long run (together with having a signed second argument for POINTER_PLUS, etc). For 64-bit platforms it might have been easier to declare that the upper half (3/4 ?) of the address space doesn't exist... I repeatedly thought of POINTER_DIFF_EXPR but adding such a basic tree code is quite a big job. Yes :-( It is probably not realistic to introduce it just to avoid a couple regressions while fixing a bug. So we'd have POINTER_DIFF_EXPR take two pointer typed args and produce ptrdiff_t. What's the advantage of having this? It represents q-p with one statement instead of 3 (long)q-(long)p or 4 (long)((ulong)q-(ulong)p). It allows us to stay in the pointer world, so (q-p)>0 is equivalent to p
Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998))
On Wed, 21 Jun 2017, Marc Glisse wrote: > On Wed, 21 Jun 2017, Jakub Jelinek wrote: > > > So, I wrote following patch to do the subtraction in unsigned > > type. It passes bootstrap, but on both x86_64-linux and i686-linux > > regresses: > > +FAIL: gcc.dg/torture/pr66178.c -O* (test for excess errors) > > +FAIL: gcc.dg/tree-ssa/cmpexactdiv-2.c scan-tree-dump-not optimized > > "minus_expr" > > +FAIL: g++.dg/tree-ssa/pr21082.C -std=gnu++* (test for excess errors) > > > > E.g. in the first testcase we have in the test: > > static uintptr_t a = ((char *)&(char *)&)+((char *)&(char > > *)&); > > Without the patch, we ended up with: > > static uintptr_t a = (uintptr_t) (((long int) - (long int) ) + ((long > > int) - (long int) )); > > but with the patch with (the negation in signed type sounds like a folding > > bug), which is too difficult for the initializer_constant_valid_p* handling: > > (uintptr_t) (((long unsigned int) -(long int) - (long unsigned int) ) > > + ((long unsigned int) + (long unsigned int) )); > > Shall we just xfail that test, or make sure we don't reassociate such > > subtractions, something different? > > Adding to match.pd a few more simple reassoc transformations (like > (c-b)+(b-a) for instance) that work for both signed and unsigned is on > my TODO-list, though that may not be enough. Maybe together with fixing > whatever produced the negation would suffice? I guess try to fix the negation and see if that magically fixes things... > > The second failure is on: > > int f (long *a, long *b, long *c) { > >__PTRDIFF_TYPE__ l1 = b - a; > >__PTRDIFF_TYPE__ l2 = c - a; > >return l1 < l2; > > } > > where without the patch during forwprop2 we optimize it > > using match.pd: > > /* X - Z < Y - Z is the same as X < Y when there is no overflow. */ > > because we had: > > b.0_1 = (long int) b_8(D); > > a.1_2 = (long int) a_9(D); > > _3 = b.0_1 - a.1_2; > > c.2_4 = (long int) c_11(D); > > a.3_5 = (long int) a_9(D); > > _6 = c.2_4 - a.3_5; > > _7 = _3 < _6; > > But with the patch we have: > > b.0_1 = (long unsigned int) b_9(D); > > a.1_2 = (long unsigned int) a_10(D); > > _3 = b.0_1 - a.1_2; > > _4 = (long int) _3; > > c.2_5 = (long unsigned int) c_11(D); > > _6 = c.2_5 - a.1_2; > > _7 = (long int) _6; > > _8 = _4 < _7; > > instead. But that is something we can't generally optimize. > > So do we need to introduce POINTER_DIFF (where we could still > > optimize this) or remove the test? If we rely on largest possible > > array to be half of the VA size - 1 (i.e. where for x > y both being > > pointers into the same array x - y > 0), then it is a valid optimization > > of the 2 pointer subtractions, but it is not a valid optimization on > > comparison of unsigned subtractions cast to signed type. > > (this testcase was meant as a simpler version of > vector.size() < vector.capacity() ) > > It does indeed seem impossible to do this optimization with the unsigned > pointer subtraction. Yep :/ This is the issue with all the non-pointer integer folding fixes as well -- as soon as we introduce unsigned ops for the fear of introducing undefined overflow we lose on followup optimization opportunities. > If we consider pointers as unsigned, with a subtraction that has a signed > result with the constraint that overflow is undefined, we cannot model that > optimally with just the usual signed/unsigned operations, so I am in favor of > POINTER_DIFF, at least in the long run (together with having a signed second > argument for POINTER_PLUS, etc). For 64-bit platforms it might have been > easier to declare that the upper half (3/4 ?) of the address space doesn't > exist... I repeatedly thought of POINTER_DIFF_EXPR but adding such a basic tree code is quite a big job. So we'd have POINTER_DIFF_EXPR take two pointer typed args and produce ptrdiff_t. What's the advantage of having this? And yes, I agree that POINTER_PLUS_EXPR should take ptrdiff_t rather than sizetype offset -- changing one without the other will lead to awkwardness in required patterns involving both like (p - q) + q. As said, it's a big job with likely all sorts of (testsuite) fallout. > > The third one is > >if ([b] - [c] != b - c) > >link_error(); > > where fold already during generic folding used to be able to cope with it, > > but now we have: > > (long int) (((long unsigned int) b - (long unsigned int) c) * 4) /[ex] 4 != > > b - c > > which we don't fold. > > Once we have this last expression, we have lost, we need to know that the > multiplication cannot overflow for this. When the size multiplications are > done in a signed type in the future (?), it might help. Not sure where the unsigned multiply comes from -- I guess we fold it inside the cast ... > On the other hand, is this an important optimization? I am surprised we are > only doing this transformation in generic (so some hack in the front-end could > still work), it shouldn't be hard to implement some
Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998))
On Wed, 21 Jun 2017, Jakub Jelinek wrote: So, I wrote following patch to do the subtraction in unsigned type. It passes bootstrap, but on both x86_64-linux and i686-linux regresses: +FAIL: gcc.dg/torture/pr66178.c -O* (test for excess errors) +FAIL: gcc.dg/tree-ssa/cmpexactdiv-2.c scan-tree-dump-not optimized "minus_expr" +FAIL: g++.dg/tree-ssa/pr21082.C -std=gnu++* (test for excess errors) E.g. in the first testcase we have in the test: static uintptr_t a = ((char *)&(char *)&)+((char *)&(char *)&); Without the patch, we ended up with: static uintptr_t a = (uintptr_t) (((long int) - (long int) ) + ((long int) - (long int) )); but with the patch with (the negation in signed type sounds like a folding bug), which is too difficult for the initializer_constant_valid_p* handling: (uintptr_t) (((long unsigned int) -(long int) - (long unsigned int) ) + ((long unsigned int) + (long unsigned int) )); Shall we just xfail that test, or make sure we don't reassociate such subtractions, something different? Adding to match.pd a few more simple reassoc transformations (like (c-b)+(b-a) for instance) that work for both signed and unsigned is on my TODO-list, though that may not be enough. Maybe together with fixing whatever produced the negation would suffice? The second failure is on: int f (long *a, long *b, long *c) { __PTRDIFF_TYPE__ l1 = b - a; __PTRDIFF_TYPE__ l2 = c - a; return l1 < l2; } where without the patch during forwprop2 we optimize it using match.pd: /* X - Z < Y - Z is the same as X < Y when there is no overflow. */ because we had: b.0_1 = (long int) b_8(D); a.1_2 = (long int) a_9(D); _3 = b.0_1 - a.1_2; c.2_4 = (long int) c_11(D); a.3_5 = (long int) a_9(D); _6 = c.2_4 - a.3_5; _7 = _3 < _6; But with the patch we have: b.0_1 = (long unsigned int) b_9(D); a.1_2 = (long unsigned int) a_10(D); _3 = b.0_1 - a.1_2; _4 = (long int) _3; c.2_5 = (long unsigned int) c_11(D); _6 = c.2_5 - a.1_2; _7 = (long int) _6; _8 = _4 < _7; instead. But that is something we can't generally optimize. So do we need to introduce POINTER_DIFF (where we could still optimize this) or remove the test? If we rely on largest possible array to be half of the VA size - 1 (i.e. where for x > y both being pointers into the same array x - y > 0), then it is a valid optimization of the 2 pointer subtractions, but it is not a valid optimization on comparison of unsigned subtractions cast to signed type. (this testcase was meant as a simpler version of vector.size() < vector.capacity() ) It does indeed seem impossible to do this optimization with the unsigned pointer subtraction. If we consider pointers as unsigned, with a subtraction that has a signed result with the constraint that overflow is undefined, we cannot model that optimally with just the usual signed/unsigned operations, so I am in favor of POINTER_DIFF, at least in the long run (together with having a signed second argument for POINTER_PLUS, etc). For 64-bit platforms it might have been easier to declare that the upper half (3/4 ?) of the address space doesn't exist... The third one is if ([b] - [c] != b - c) link_error(); where fold already during generic folding used to be able to cope with it, but now we have: (long int) (((long unsigned int) b - (long unsigned int) c) * 4) /[ex] 4 != b - c which we don't fold. Once we have this last expression, we have lost, we need to know that the multiplication cannot overflow for this. When the size multiplications are done in a signed type in the future (?), it might help. On the other hand, is this an important optimization? I am surprised we are only doing this transformation in generic (so some hack in the front-end could still work), it shouldn't be hard to implement some subset of fold_addr_of_array_ref_difference in match.pd (it is recursive so a complete move may be harder). But that would make your patch break even more stuff :-( -- Marc Glisse
Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998))
On Wed, Jun 21, 2017 at 04:40:01PM +0200, Jakub Jelinek wrote: > So, I wrote following patch to do the subtraction in unsigned > type. It passes bootstrap, but on both x86_64-linux and i686-linux > regresses: > +FAIL: gcc.dg/torture/pr66178.c -O* (test for excess errors) > +FAIL: gcc.dg/tree-ssa/cmpexactdiv-2.c scan-tree-dump-not optimized > "minus_expr" > +FAIL: g++.dg/tree-ssa/pr21082.C -std=gnu++* (test for excess errors) Another option is to do what the patch does only when sanitizing and accept in that case less efficient code and rejection of weird corner case testcases like the first one. We risk miscompilation of the pointer differences, but I haven't managed to come up with a testcase where it would show (I guess more likely is when we propagate constants into the pointers). Jakub
[RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998))
On Tue, Jun 20, 2017 at 10:18:20AM +0200, Richard Biener wrote: > > > > 3) not really related to this patch, but something I also saw during the > > > > bootstrap-ubsan on i686-linux: > > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: > > > > -2147426384 - 2147475412 cannot be represented in type 'int' > > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: > > > > -2147426384 - 2147478324 cannot be represented in type 'int' > > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: > > > > -2147450216 - 2147451580 cannot be represented in type 'int' > > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: > > > > -2147450216 - 2147465664 cannot be represented in type 'int' > > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: > > > > -2147469348 - 2147451544 cannot be represented in type 'int' > > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: > > > > -2147482364 - 2147475376 cannot be represented in type 'int' > > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: > > > > -2147483624 - 2147475376 cannot be represented in type 'int' > > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: > > > > -2147483628 - 2147451544 cannot be represented in type 'int' > > > > ../../gcc/memory-block.cc:59:4: runtime error: signed integer overflow: > > > > -2147426384 - 2147475376 cannot be represented in type 'int' > > > > ../../gcc/memory-block.cc:59:4: runtime error: signed integer overflow: > > > > -2147450216 - 2147451544 cannot be represented in type 'int' > > > > The problem here is that we lower pointer subtraction, e.g. > > > > long foo (char *p, char *q) { return q - p; } > > > > as return (ptrdiff_t) ((ssizetype) q - (ssizetype) p); > > > > and even for a valid testcase where we have an array across > > > > the middle of the virtual address space, say the first one above > > > > is (char *) 0x8000dfb0 - (char *) 0x7fffdfd4 subtraction, even if > > > > there is 128KB array starting at 0x7fffd000, it will yield > > > > UB (not in the source, but in whatever the compiler lowered it into). > > > > So, shall we instead do the subtraction in sizetype and only then > > > > cast? For sizeof (*ptr) > 1 I think we have some outstanding PR, > > > > and it is more difficult to find out in what types to compute it. > > > > Or do we want to introduce POINTER_DIFF_EXPR? > > > > > > Just use uintptr_t for the difference computation (well, an unsigned > > > integer type of desired precision -- mind address-spaces), then cast > > > the result to signed. > > > > Ok (of course, will handle this separately from the rest). > > Yes. Note I didn't look at the actual patch (yet). So, I wrote following patch to do the subtraction in unsigned type. It passes bootstrap, but on both x86_64-linux and i686-linux regresses: +FAIL: gcc.dg/torture/pr66178.c -O* (test for excess errors) +FAIL: gcc.dg/tree-ssa/cmpexactdiv-2.c scan-tree-dump-not optimized "minus_expr" +FAIL: g++.dg/tree-ssa/pr21082.C -std=gnu++* (test for excess errors) E.g. in the first testcase we have in the test: static uintptr_t a = ((char *)&(char *)&)+((char *)&(char *)&); Without the patch, we ended up with: static uintptr_t a = (uintptr_t) (((long int) - (long int) ) + ((long int) - (long int) )); but with the patch with (the negation in signed type sounds like a folding bug), which is too difficult for the initializer_constant_valid_p* handling: (uintptr_t) (((long unsigned int) -(long int) - (long unsigned int) ) + ((long unsigned int) + (long unsigned int) )); Shall we just xfail that test, or make sure we don't reassociate such subtractions, something different? The second failure is on: int f (long *a, long *b, long *c) { __PTRDIFF_TYPE__ l1 = b - a; __PTRDIFF_TYPE__ l2 = c - a; return l1 < l2; } where without the patch during forwprop2 we optimize it using match.pd: /* X - Z < Y - Z is the same as X < Y when there is no overflow. */ because we had: b.0_1 = (long int) b_8(D); a.1_2 = (long int) a_9(D); _3 = b.0_1 - a.1_2; c.2_4 = (long int) c_11(D); a.3_5 = (long int) a_9(D); _6 = c.2_4 - a.3_5; _7 = _3 < _6; But with the patch we have: b.0_1 = (long unsigned int) b_9(D); a.1_2 = (long unsigned int) a_10(D); _3 = b.0_1 - a.1_2; _4 = (long int) _3; c.2_5 = (long unsigned int) c_11(D); _6 = c.2_5 - a.1_2; _7 = (long int) _6; _8 = _4 < _7; instead. But that is something we can't generally optimize. So do we need to introduce POINTER_DIFF (where we could still optimize this) or remove the test? If we rely on largest possible array to be half of the VA size - 1 (i.e. where for x > y both being pointers into the same array x - y > 0), then it is a valid optimization of the 2 pointer subtractions, but it is not a valid optimization on comparison of unsigned subtractions cast to signed type. The third one is if ([b] - [c] != b - c)