[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined

2018-08-26 Thread jvg1981 at aim dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303

--- Comment #20 from Joshua Green  ---
> "But if we don't know which pointer is greater, it gets more complicated:
> ..."
> 
> I'm not sure that this is true.  For types that are larger than 1 byte, it
> seems that one can do the subtraction after any division(s), hence only
> costing an additional division (or shift):
> 
> T * p;
> T * q;
> 
> .
> .
> .
> 
> intptr_t pVal = ((uintptr_t) p)/(sizeof *p);
> intptr_t qVal = ((uintptr_t) q)/(sizeof *q);
> 
> ptrdiff_t p_q = pVal - qVal;
> 
> This should work in well-defined cases, for if p and q are pointers into the
> same array then (presumably) ((uintptr_t) p) and ((uintptr_t) q) must have
> the same remainder modulo sizeof(T).
> 
> Of course, even an additional shift may be too expensive in some cases, so
> it's not entirely clear that this change should be made.

It occurred to me that such contortions can be avoided in the (possibly) common
case when (say) q is actually an array.
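
One possible reading of the array case, as a minimal sketch: if q is itself the array and p is assumed to point into it, p's address cannot be below q's, so a single unsigned subtraction followed by one division is enough. The names N and index_in_q are hypothetical, and int stands in for T.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N 16            /* hypothetical array length */
static int q[N];        /* q is a real array, not just a pointer */

/* Assumes p points into q (or one past its end). Under that assumption the
   unsigned byte difference cannot wrap, so no sign handling is needed. */
static ptrdiff_t index_in_q(const int *p)
{
    assert((uintptr_t)p >= (uintptr_t)q);   /* the assumption, made explicit */
    return (ptrdiff_t)(((uintptr_t)p - (uintptr_t)q) / sizeof *p);
}
```

Whether a compiler can actually prove that p points into q is a separate question that this sketch does not address.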

[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined

2018-08-21 Thread jvg1981 at aim dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303

Joshua Green  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jvg1981 at aim dot com

--- Comment #17 from Joshua Green  ---
"But if we don't know which pointer is greater, it gets more complicated: ..."

I'm not sure that this is true.  For types that are larger than 1 byte, it
seems that one can do the subtraction after any division(s), hence only costing
an additional division (or shift):

T * p;
T * q;

.
.
.

intptr_t pVal = ((uintptr_t) p)/(sizeof *p);
intptr_t qVal = ((uintptr_t) q)/(sizeof *q);

ptrdiff_t p_q = pVal - qVal;

This should work in well-defined cases, for if p and q are pointers into the
same array then (presumably) ((uintptr_t) p) and ((uintptr_t) q) must have the
same remainder modulo sizeof(T).

Of course, even an additional shift may be too expensive in some cases, so it's
not entirely clear that this change should be made.
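
A self-contained sketch of the divide-first idea, with int standing in for T and a hypothetical helper name (diff_divide_first). The assert spells out the presumption above that both addresses share the same remainder modulo sizeof(T); this illustrates the arithmetic only and is not what GCC emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Element difference p - q, dividing each address before subtracting.
   Assumes p and q point into the same array of int. */
static ptrdiff_t diff_divide_first(const int *p, const int *q)
{
    /* Same remainder modulo sizeof(int), as presumed for same-array pointers. */
    assert((uintptr_t)p % sizeof *p == (uintptr_t)q % sizeof *q);

    /* For element sizes above 1 byte each quotient fits in intptr_t,
       so the signed subtraction below cannot overflow. */
    intptr_t pVal = (intptr_t)((uintptr_t)p / sizeof *p);
    intptr_t qVal = (intptr_t)((uintptr_t)q / sizeof *q);

    return (ptrdiff_t)(pVal - qVal);
}
```

The result is correct whichever pointer is greater, at the price of two divisions (or shifts) instead of one.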

[Bug tree-optimization/66573] Unexpected change in static, branch-prediction cost from O1 to O2 in if-then-else.

2015-11-24 Thread jvg1981 at aim dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66573

--- Comment #11 from Joshua Green  ---
(In reply to Segher Boessenkool from comment #10)
> GCC thinks bar2 will be executed more often than bar1; the code
> it generates is perfectly fine for that.
> 
> If you think GCC's heuristics for branch prediction are no good,
> could use some improvement, you'll have to come up with more
> evidence than just a single artificial testcase.  Sorry.  These
> things were tuned on real code.

If gcc's heuristic is indeed optimal when tested over a reasonable sample of
real code, then I withdraw my objection.

[Bug tree-optimization/66573] Unexpected change in static, branch-prediction cost from O1 to O2 in if-then-else.

2015-11-21 Thread jvg1981 at aim dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66573

--- Comment #7 from Joshua Green  ---
(In reply to Segher Boessenkool from comment #6)
> bb-reorder changes the conditional branch so that the fallthrough path 
> is the most likely.  It now also does this for -O1.  This is faster on
> essentially all processors, including the ones the OP mentions.
> 
> Without profiling information showing otherwise, GCC assumes the call
> to bar2 is more frequent than the one to bar1 (61% vs. 39%).  This is
> a heuristic, it might need retuning, but that needs a lot more data
> than this one testcase.
> 
> Closing as invalid.

While I agree that this isn't really a bug, I find the above reasoning hard to
follow.  The compiler could treat the original foo as

if (i) {
    bar1();
} else {
    bar2();
}

or

if (!i) {
    bar2();
} else {
    bar1();
}

and I see no reason why expecting the "else" block should a priori be
preferable in either case.  (It's also not clear HOW this could be "faster on
essentially all processors" in either case, though I'm open to being corrected
and/or enlightened on this subject.)  Of course, the compiler is free to make
whatever guess it wants, but it would be nice if the programmer had some
portable way of expressing his/her own expectations, and it seems that other
compilers provide that by "agreeing" to expect the "if" block (as, indeed,
various online articles recommend).
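
For reference, GCC does offer an explicit (though not ISO-portable) way to state such an expectation: the __builtin_expect extension, which Clang also accepts. A minimal sketch follows; the likely() macro name is only a common convention, and the bar1/bar2 bodies are hypothetical stand-ins that just count calls.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the bar1/bar2 of the report. */
static int bar1_calls, bar2_calls;
static void bar1(void) { ++bar1_calls; }
static void bar2(void) { ++bar2_calls; }

/* Conventional wrapper name; GCC itself only provides __builtin_expect. */
#define likely(x) __builtin_expect(!!(x), 1)

void foo(bool i)
{
    if (likely(i)) {    /* tell the compiler the "if" block is expected */
        bar1();
    } else {
        bar2();
    }
}
```

The hint changes only the compiler's static prediction, not the program's behavior, and other compilers may need a different spelling.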

[Bug tree-optimization/66573] Unexpected change in static, branch-prediction cost from O1 to O2 in if-then-else.

2015-11-21 Thread jvg1981 at aim dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66573

--- Comment #9 from Joshua Green  ---
(In reply to Segher Boessenkool from comment #8)
> GCC does some fairly involved prediction (in predict.c).  It isn't
> "a priori".
> 
> > (It's also not clear HOW this could be "faster
> > on essentially all processors"
> 
> Fall-through is faster than branching in most cases.  Most CPUs have
> some kind of pipelining on instruction fetch.
> 

This is the point on which I'm confused.  I understand that fall-through is
faster than branching and that it's good to keep the pipeline running smoothly.
It seems to me, though, that in this case the compiler has complete freedom in
deciding which function call (bar1() or bar2()) is in the "fall through case"
and which is in the "branching case."  Why not make the same choice as other
compilers do (and documentation recommends, and O0 does [, and O1 used to do?])
by replacing the above O2-O3 code with

foo(bool):
        testb   %dil, %dil
        je      .L4
        jmp     bar1()
.L4:
        jmp     bar2()

?

[Bug tree-optimization/66573] Unexpected change in static, branch-prediction cost from O1 to O2 in if-then-else.

2015-11-19 Thread jvg1981 at aim dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66573

jvg1981 at aim dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jvg1981 at aim dot com

--- Comment #5 from jvg1981 at aim dot com ---
I recently came across this surprising behavior.  Has anyone taken a serious
look at it?  Is it likely to be corrected/changed?