On Mon, May 13, 2019 at 3:38 PM Marc Glisse <marc.gli...@inria.fr> wrote: > > On Mon, 13 May 2019, Richard Biener wrote: > > > On Sun, May 12, 2019 at 2:51 PM Marc Glisse <marc.gli...@inria.fr> wrote: > >> > >> On Sun, 12 May 2019, Richard Sandiford wrote: > >> > >>> Marc Glisse <marc.gli...@inria.fr> writes: > >>>> Hello, > >>>> > >>>> this patch lets gcc know that if a pointer existed before the call to > >>>> malloc, the result of malloc cannot alias it. This is a bit ad hoc and > >>>> fragile. A small improvement would be, when the 2 statements are in the > >>>> same bb but in the wrong order, to check if there is any statement in > >>>> between that might prevent from reordering them. But that's more > >>>> complicated, and the patch as it is already does help. > >>>> > >>>> I expect people may not like the call to a function from > >>>> tree-ssa-loop-niter too much, but it is convenient. And if someone > >>>> improves it, they will likely have to rewrite something not quite > >>>> equivalent to stmt_dominates_stmt_p. > >>> > >>> It has linear complexity for statements in the same block though. > >> > >> Ah, true. I should anyway test that the second statement is malloc > >> (cheaper) before checking for domination, but even then that could be used > >> to create a quadratic behavior :-( > > > > I also think the oracle queries shouldn't encompany such expensive pieces... > > Well, it is only expensive because finding the order of statements in a > basic block is expensive. Would it be better if it only did this check if > we are in one of a very limited set of passes where statements have > reliable UIDs? Maybe that's not enough passes, loop-distribution uses UIDs > but I didn't check if it uses them in a way compatible with this use... > > What if I only check basic-block domination and punt when the statements > are in the same basic block? That seems cheap enough, and would already > work for the vector testcase. > > >>> (reassoc_stmt_dominates_stmt_p avoids that, but relies on uids > >>> being up-to-date.) > >> > >> This function is called from all over the place. Unless there is a simple > >> test to check if uids are safe to use (reassoc_in_progress?), that's going > >> to be a problem. I find it surprising that this information is so hard to > >> get to... Stopping stmt_dominates_stmt_p after walking 128 statements > >> doesn't feel like a great solution. But without controlling the pass where > >> this happens, I don't see any good solution. It would be great if that > >> non-aliasing property could be recorded in the points-to information, then > >> I could set it from a pass I control, but I somehow got the impression > >> that it wouldn't work. Maybe I should try to understand PTA better to make > >> sure. > > > > In princple PTA should know the aliasing cannot happen but possibly > > the info is either lost or the IL is too obfuscated at the point it gets > > computed. (hello C++...) > > We don't need much obfuscation for this, a simple C program > > int g; > int*f(int**p){ > int*q=*p; > int*r=__builtin_malloc(4); > *q=1; > *r=2; > g=*q; > return r; > } > > gives > > q_4 = *p_3(D); > r_6 = __builtin_malloc (4); > *q_4 = 1; > *r_6 = 2; > _1 = *q_4; > g = _1; > return r_6; > > where we clearly don't manage to prove that q and r don't alias.
We can probably improve this one in general from inside PTA by treating escapes through return specially. I wasn't looking too closely at the C++ testcase but I simply assumed the addition to ESCAPED happens through storing the malloc result to memory that PTA cannot prove local. > > So at ldist time I see > > > > # PT = null { D.47989 } (escaped, escaped heap) > > # ALIGN = 8, MISALIGN = 0 > > # USE = nonlocal null { D.47989 } (escaped, escaped heap) > > # CLB = nonlocal null { D.47989 } (escaped, escaped heap) > > _27 = malloc (8192); > > > > good. The interesting loop is the following where the allocation PTA > > is good and _4 intersect because they include ESCAPED. This is > > because _27 itself escapes and PTA not being flow-sensitive. I don't > > see an easy way to improve things here but dislike the stmt walking > > very much. That is, the proper solution is to re-write PTA in some way. > > I am trying to imagine what that might look like. I imagine that instead > of having a single "escaped" set, we would have multiple escaped sets at > different points in the function (at most one per VDEF?). Then _4 would > only contain escaped3 while heap5 (*_27) only appears in escaped9 and > later? It may be tricky to keep a linear-ish complexity with anything more > powerful than the current PTA. I don't know what LLVM is doing, but they > seem to manage. IIRC LLVM uses Steensgard analysis which is flow-sensitive but generally less powerful. We are using Andersen constraint-based analysis because Steensgard was patented back in time (that patent has now expired). One approach to make PTA "more" flow-sensitive is to partition the function into regions (basically represent the function as a series of function calls to its parts). For the issue above there's the long-standing issue of splitting escapes from function return from escapes to functions called to properly represent local variable lifetime. That said, your simpler patch still relies on up-to-date dominators, sth not guaranteed or required previously. We might consider implementing a separate (region-based) analysis adjusting/pruning points-to information with flow-sensitive knowlege. This might even work when done from inside PTA as post-processing. Richard. > > -- > Marc Glisse