[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303 --- Comment #20 from Joshua Green --- > "But if we don't know which pointer is greater, it gets more complicated: > ..." > > I'm not sure that this is true. For types that are larger than 1 byte, it > seems that one can do the subtraction after any division(s), hence only > costing an additional division (or shift): > > T * p; > T * q; > > . > . > . > > intptr_t pVal = ((uintptr_t) p)/(sizeof *p); > intptr_t qVal = ((uintptr_t) q)/(sizeof *q); > > ptrdiff_t p_q = pVal - qVal; > > This should work in well-defined cases, for if p and q are pointers into the > same array then (presumably) ((uintptr_t) p) and ((uintptr_t) q) must have > the same remainder modulo sizeof(T). > > Of course, even an additional shift may be too expensive in some cases, so > it's not entirely clear that this change should be made. It occurred to me that such contortions can be avoided in the (possibly) common case when (say) q is actually an array.
[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303 --- Comment #19 from Marc Glisse --- This seems fixed in gcc-8, no?
[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303 --- Comment #18 from joseph at codesourcery dot com --- On Tue, 21 Aug 2018, jvg1981 at aim dot com wrote: > intptr_t pVal = ((uintptr_t) p)/(sizeof *p); > intptr_t qVal = ((uintptr_t) q)/(sizeof *q); Note that this can be expanded like an exact division (right shift and multiply by reciprocal). See bug 67999 comment 24 and bug 81801 comment 4.
[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303 Joshua Green changed: What|Removed |Added CC||jvg1981 at aim dot com --- Comment #17 from Joshua Green --- "But if we don't know which pointer is greater, it gets more complicated: ..." I'm not sure that this is true. For types that are larger than 1 byte, it seems that one can do the subtraction after any division(s), hence only costing an additional division (or shift): T * p; T * q; . . . intptr_t pVal = ((uintptr_t) p)/(sizeof *p); intptr_t qVal = ((uintptr_t) q)/(sizeof *q); ptrdiff_t p_q = pVal - qVal; This should work in well-defined cases, for if p and q are pointers into the same array then (presumably) ((uintptr_t) p) and ((uintptr_t) q) must have the same remainder modulo sizeof(T). Of course, even an additional shift may be too expensive in some cases, so it's not entirely clear that this change should be made.
[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303 --- Comment #16 from Alexander Cherepanov --- On 2015-11-23 14:58, rguenth at gcc dot gnu.org wrote: > Note that in practice it needs exposal of the address constant to trigger the > bogus optimization. No. The program: #include #include #include #include int main() { // make sure our allocation will cross 0x8000 boundary // but still will not violate limits of gcc size_t len = 0x7fff; char *p1 = malloc(len); char *volatile v = p1 + len; char *p2 = v; long l1 = (long)(uintptr_t)p1; long l2 = (long)(uintptr_t)p2; if (l2 < l1) { long z = p2 - p1; if (z < 0) printf("%ld\n", z); } } prints a positive number for me even though it guards against it. gcc 5.2 with -m32 -O2.
[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2015-11-23 Ever confirmed|0 |1 --- Comment #15 from Richard Biener --- Note that in practice it needs exposal of the address constant to trigger the bogus optimization. Note that the IL from the frontend is indeed the one to blame here: char * p = (char *) mmap (2147479552B, 8192, 3, 50, -1, 0); char * q = 2147483647B; if ((int) (p + 4096) - (int) q > 0) { for correctness WRT undefined overflow we need to do the subtraction in unsigned arithmetic and then interpret the result as signed (for an eventual division by element size). Same issue in the C++ frontend btw.
[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303 --- Comment #14 from Florian Weimer --- (In reply to Szabolcs Nagy from comment #13) > if gcc treats p-q as (ssize_t)p-(ssize_t)q and makes > optimization decisions based on signed int range then > that's broken and leads to wrong code gen. Thanks for the test case. I think the remedy proposed so far (glibc should block allocations sized half of the address space and larger) is insufficient.
[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303 Szabolcs Nagy changed: What|Removed |Added CC||nszabolcs at gmail dot com --- Comment #13 from Szabolcs Nagy --- if gcc treats p-q as (ssize_t)p-(ssize_t)q and makes optimization decisions based on signed int range then that's broken and leads to wrong code gen. e.g. gcc optimizes if(n - 0x7fff > 0).. away (but not if(-0x7fff-1 - n > 0), but that's another bug), so $ cat bug.c #include int main() { char *p = mmap((void*)(0x8000-4096), 2*4096, PROT_READ|PROT_WRITE, MAP_FIXED|MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); char *q = (void*)(0x7fff); // p+4095 if ((p+4096) - q > 0) return 0; // wrongly optimized away return 1; } $ gcc-5.2-i386 -fomit-frame-pointer -fno-asynchronous-unwind-tables -O3 -S bug.c $ cat bug.s .file "bug.c" .section.text.unlikely,"ax",@progbits .LCOLDB0: .section.text.startup,"ax",@progbits .LHOTB0: .p2align 2,,3 .globl main .type main, @function main: leal4(%esp), %ecx andl$-16, %esp pushl -4(%ecx) pushl %ebp movl%esp, %ebp pushl %ecx subl$8, %esp pushl $0 pushl $0 pushl $-1 pushl $50 pushl $3 pushl $8192 pushl $2147479552 callmmap addl$32, %esp movl$1, %eax movl-4(%ebp), %ecx leave leal-4(%ecx), %esp ret .size main, .-main .section.text.unlikely .LCOLDE0: .section.text.startup .LHOTE0: .ident "GCC: (GNU) 5.2.0" .section.note.GNU-stack,"",@progbits after the mmap call %eax is unconditionally set to 1. at runtime the mmap succeeds and the returned object crosses the 0x8000 boundary, so the return value is incorrect. (i found this bug report after incorrectly getting SIGILL at ptrdiffs with -fsanitize=undefined -fsanitize-undefined-trap-on-error )
[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303 --- Comment #12 from Richard Biener --- (In reply to Jakub Jelinek from comment #3) > The problem is that we don't have a POINTER_DIFF_EXPR similar to > POINTER_PLUS_EXPR, which would take two pointers and return an integer, and > the FEs emit pointer difference as cast of both the pointers to signed > integral type > and subtracts the integers. > If > ssize_t foo (char *p, char *q) { return p - q; } > is changed into > ssize_t foo (char *p, char *q) { return (ssize_t) p - (ssize_t) q; } > by the FE, then indeed if you have array starting at 0x7fff and ending > at 0x8001 and subtract those two pointers, you get undefined behavior. > That is undefined behavior not just for ubsan, but for anything else in the > middle-end. > So, if pointer difference is supposed to behave differently, then > we'd either need to represent pointer difference as > ssize_t foo (char *p, char *q) { return (ssize_t) ((size_t) p - (size_t) q); > } > (but we risk missed optimizations that way I'd say), or we'd need a better > representation of it in the middle-end. Note that apart from missing POINTER_DIFF this isn't a middle-end but a frontend issue.
[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303 --- Comment #11 from mikulas at artax dot karlin.mff.cuni.cz --- Richard Biener: if the middle end tells us that one pointer is greater or equal than the other pointer, we could do unsigned subtraction and shift. But if we don't know which pointer is greater, it gets more complicated: To do correct short* pointer subtraction, we need to subtract pointers using sub %edx, %eax; rcr $1, %eax --- i.e. shift the carry bit back to the topmost bit of the result. According to Agner's tables, rcr with 1-bit count takes 1 tick on AMD and 2 ticks on Intel, so the performance penalty isn't that big. On other architectures that lack rcr, it would be more complicated. Another possibility is to file a defect report on the C standard and request that program in comment 4 be considered invalid. - for example, change the wording to this: "If the result multiplied by the size of the array element is not representable in an object of that type, the behavior is undefined." - that would specify that that subtraction is invalid.
[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303 --- Comment #10 from Richard Biener --- To support the standards definition of p1 - p2 we'd need a POINTER_DIFF_EXPR that also embeds the exact division by the array element size. Btw, while C and C++ share pointer_int_sum they have differing implementations for computing pointer differences. The safe variant would be indeed to compute the pointer difference using an unsigned type and I can't see what optimizations we lose when doing that. Note that you'd still need to convert the result to a signed type before doing the exact division by the element size.
[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303 --- Comment #9 from mikulas at artax dot karlin.mff.cuni.cz --- > See what I wrote, any object size bigger than half of address space really > isn't supportable, because then (char *) (P) - (char *) (Q) might not fit into > ptrdiff_t. There is no point slowing down all pointer subtractions (other > than > char/signed char/unsigned char pointers) for something that really wouldn't > work reliably anyway. But the code in comment 4 doesn't perform (char *)P - (char *)Q. It performs (short *)P - (short *)Q. And that result clearly fits into the signed ptrdiff_t type. If the code in comment 4 performed (char *)b - (char *)a, that operation would be invalid because of overflow. But it doesn't.
[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303 --- Comment #8 from Jakub Jelinek --- (In reply to mikulas from comment #6) > Regarding pointer difference, the C standard says this: > > When two pointers are subtracted, both shall point to elements of the same > array object, or one past the last element of the array object; the result > is the difference of the subscripts of the two array elements. The size of > the result is implementation-defined, and its type (a signed integer type) > is ptrdiff_t defined in the header. If the result is not > representable in an object of that type, the behavior is undefined. In other > words, if the expressions P and Q point to, respectively, the i-th and j-th > elements of an array object, the expression (P)-(Q) has the value i−j > provided the value fits in an object of type ptrdiff_t. > > So: p points to the beginning, q points one past the last element, so the > first condition is valid. > > The result is the difference of the subscripts of those two array elements: > 0x5000 - 0 = 0x5000 - this is clearly representable in the type > ptrdiff_t, so 0x5000 result should be returned. See what I wrote, any object size bigger than half of address space really isn't supportable, because then (char *) (P) - (char *) (Q) might not fit into ptrdiff_t. There is no point slowing down all pointer subtractions (other than char/signed char/unsigned char pointers) for something that really wouldn't work reliably anyway.
[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303 --- Comment #7 from joseph at codesourcery dot com --- Yes, I consider it a bug in malloc that it produces objects 2GB or more in size on 32-bit systems (because of the one-past-end address, the largest size that can't produce undefined behavior in the user program is 2GB minus one byte). Unfortunately I expect some 32-bit applications rely on such large allocations, so if we changed malloc (please report a bug to glibc Bugzilla) we'd need a way (feature test macro?) for people to continue to build programs to use the old malloc, as well as to avoid breaking existing binaries.
[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303 --- Comment #6 from mikulas at artax dot karlin.mff.cuni.cz --- "you really can't have an object bigger than half of the address space in C/C++" - where does the standard claim this? If this is true, we should change malloc so that it doesn't allocate 2GiB or larger objects. Regarding pointer difference, the C standard says this: When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object; the result is the difference of the subscripts of the two array elements. The size of the result is implementation-defined, and its type (a signed integer type) is ptrdiff_t defined in the header. If the result is not representable in an object of that type, the behavior is undefined. In other words, if the expressions P and Q point to, respectively, the i-th and j-th elements of an array object, the expression (P)-(Q) has the value i−j provided the value fits in an object of type ptrdiff_t. So: p points to the beginning, q points one past the last element, so the first condition is valid. The result is the difference of the subscripts of those two array elements: 0x5000 - 0 = 0x5000 - this is clearly representable in the type ptrdiff_t, so 0x5000 result should be returned.
[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303 --- Comment #5 from Jakub Jelinek --- (In reply to mikulas from comment #4) > ... and another related problem (try this on 32-bit system): > > #include > #include > > int main(void) > { > short *a = malloc(0x5000 * sizeof(short)); > short *b = a + 0x5000; > printf("%ld\n", (long)(b - a)); > return 0; > } > > Here, the return value should be positive (0x5000), but it is negative. > IMHO, according to the C standard, this is program correct and positive > result should be returned. This testcase is invalid, you really can't have an object bigger than half of the address space in C/C++, pointer difference is signed ptrdiff_t and if you have larger object, you can't subtract arbitrary char pointers in it anymore. If you need more than 2GB in a single array, just use 64-bit system.
[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303 --- Comment #4 from mikulas at artax dot karlin.mff.cuni.cz --- ... and another related problem (try this on 32-bit system): #include #include int main(void) { short *a = malloc(0x5000 * sizeof(short)); short *b = a + 0x5000; printf("%ld\n", (long)(b - a)); return 0; } Here, the return value should be positive (0x5000), but it is negative. IMHO, according to the C standard, this is program correct and positive result should be returned. The problem is that it is not easy to fix it without performance penalty and all compilers that I tried (gcc, clang, icc, suncc, opencc, nwcc) print negative result.
[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303 Jakub Jelinek changed: What|Removed |Added CC||jason at gcc dot gnu.org, ||jsm28 at gcc dot gnu.org, ||rguenth at gcc dot gnu.org --- Comment #3 from Jakub Jelinek --- The problem is that we don't have a POINTER_DIFF_EXPR similar to POINTER_PLUS_EXPR, which would take two pointers and return an integer, and the FEs emit pointer difference as cast of both the pointers to signed integral type and subtracts the integers. If ssize_t foo (char *p, char *q) { return p - q; } is changed into ssize_t foo (char *p, char *q) { return (ssize_t) p - (ssize_t) q; } by the FE, then indeed if you have array starting at 0x7fff and ending at 0x8001 and subtract those two pointers, you get undefined behavior. That is undefined behavior not just for ubsan, but for anything else in the middle-end. So, if pointer difference is supposed to behave differently, then we'd either need to represent pointer difference as ssize_t foo (char *p, char *q) { return (ssize_t) ((size_t) p - (size_t) q); } (but we risk missed optimizations that way I'd say), or we'd need a better representation of it in the middle-end.
[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303 --- Comment #2 from mikulas at artax dot karlin.mff.cuni.cz --- Jakub Jelinek: I know, but the problem happened in perfectly valid program. Suppose that you do: char *p = malloc(0x2000); - the allocator allocates the array at 0x7000. Then, you do: char *q = p + 0x2000; /* q is 0x9000, pointing to the end of the array */ long n = q - p; --- this triggers the warning, although it is perfectly valid operation. The above case is non-reproducible because it depends on the address returned from the allocator. I wrote the example code to trigger the warning deterministically.
[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #1 from Jakub Jelinek --- Your testcase is definitely not valid C, you can't perform pointer arithmetics in between pointers that don't point into the same array or one past the last element in the array.