On Mon, 13 May 2019, Richard Biener wrote:

On Sun, May 12, 2019 at 2:51 PM Marc Glisse <marc.gli...@inria.fr> wrote:

On Sun, 12 May 2019, Richard Sandiford wrote:

Marc Glisse <marc.gli...@inria.fr> writes:
Hello,

this patch lets gcc know that if a pointer existed before the call to
malloc, the result of malloc cannot alias it. This is a bit ad hoc and
fragile. A small improvement would be, when the 2 statements are in the
same bb but in the wrong order, to check if there is any statement in
between that might prevent from reordering them. But that's more
complicated, and the patch as it is already does help.

I expect people may not like the call to a function from
tree-ssa-loop-niter too much, but it is convenient. And if someone
improves it, they will likely have to rewrite something not quite
equivalent to stmt_dominates_stmt_p.

It has linear complexity for statements in the same block though.

Ah, true. I should anyway test that the second statement is malloc
(cheaper) before checking for domination, but even then that could be used
to create a quadratic behavior :-(

I also think the oracle queries shouldn't encompany such expensive pieces...

Well, it is only expensive because finding the order of statements in a basic block is expensive. Would it be better if it only did this check if we are in one of a very limited set of passes where statements have reliable UIDs? Maybe that's not enough passes, loop-distribution uses UIDs but I didn't check if it uses them in a way compatible with this use...

What if I only check basic-block domination and punt when the statements are in the same basic block? That seems cheap enough, and would already work for the vector testcase.

(reassoc_stmt_dominates_stmt_p avoids that, but relies on uids
being up-to-date.)

This function is called from all over the place. Unless there is a simple
test to check if uids are safe to use (reassoc_in_progress?), that's going
to be a problem. I find it surprising that this information is so hard to
get to... Stopping stmt_dominates_stmt_p after walking 128 statements
doesn't feel like a great solution. But without controlling the pass where
this happens, I don't see any good solution. It would be great if that
non-aliasing property could be recorded in the points-to information, then
I could set it from a pass I control, but I somehow got the impression
that it wouldn't work. Maybe I should try to understand PTA better to make
sure.

In princple PTA should know the aliasing cannot happen but possibly
the info is either lost or the IL is too obfuscated at the point it gets
computed.  (hello C++...)

We don't need much obfuscation for this, a simple C program

int g;
int*f(int**p){
  int*q=*p;
  int*r=__builtin_malloc(4);
  *q=1;
  *r=2;
  g=*q;
  return r;
}

gives

  q_4 = *p_3(D);
  r_6 = __builtin_malloc (4);
  *q_4 = 1;
  *r_6 = 2;
  _1 = *q_4;
  g = _1;
  return r_6;

where we clearly don't manage to prove that q and r don't alias.

So at ldist time I see

 # PT = null { D.47989 } (escaped, escaped heap)
 # ALIGN = 8, MISALIGN = 0
 # USE = nonlocal null { D.47989 } (escaped, escaped heap)
 # CLB = nonlocal null { D.47989 } (escaped, escaped heap)
 _27 = malloc (8192);

good.  The interesting loop is the following where the allocation PTA
is good and _4 intersect because they include ESCAPED.  This is
because _27 itself escapes and PTA not being flow-sensitive.  I don't
see an easy way to improve things here but dislike the stmt walking
very much.  That is, the proper solution is to re-write PTA in some way.

I am trying to imagine what that might look like. I imagine that instead of having a single "escaped" set, we would have multiple escaped sets at different points in the function (at most one per VDEF?). Then _4 would only contain escaped3 while heap5 (*_27) only appears in escaped9 and later? It may be tricky to keep a linear-ish complexity with anything more powerful than the current PTA. I don't know what LLVM is doing, but they seem to manage.

--
Marc Glisse

Reply via email to