Hi,

I would recommend that this optimization be restricted to DefaultRectangular
arrays, at least to start with. They are the simplest arrays to reason about
and we can more easily make assumptions and guarantees about them. Mainly,
DefaultRectangular arrays cannot alias another array's elements.

I would also recommend that this be an optimization pass after the
loopInvariantCodeMotion pass. This will probably be the easiest time to perform
analysis since everything has been inlined and most expressions won't be 
drastically
altered further.

I'll also recommend that this optimization only fire under "--local" 
compilation,
at least to start. Dealing with local variables will be easier than remote 
variables
at the beginning.

To determine if two arrays alias each other, I would recommend you start by
following symbols with the "_array" record type as they are move'd, addrof'd,
and passed to functions. If an _array is acquired from a PRIM_GET_MEMBER*,
a PRIM_ARRAY_GET*, or is the result of a function call, I would recommend you
start by considering that array to be an alias. You will probably need to
make an exception for the ' chpl__convertRuntimeTypeToValue' function that
creates the array. I'm not really familiar with traditional alias analysis,
but I suspect you'll want to build up sets of aliases.

At this point I'll pause and ask a question:

Let's say we have a chapel statement like this:
A[i] = A[j] + B[k];

It will boil down to AST like this:

```
var class0 = A._instance;
var data0 = class0.shiftedData;
ref element0 = data0[i];

var class1 = A._instance;
var data1 = class1.shiftedData;
ref element1 = data1[j];

var classB = B._instance;
var dataB = classB.shiftedData;
ref elementB = dataB[k];

element0 = element1 + elementB;
```

Here "data0[i]" is just like indexing into a C pointer. I believe this would
translate to a GEP in LLVM, meaning element0 would also be a pointer.

Do we need to perform the analysis to determine that i != j? Is there a way
we can offload that analysis to LLVM? I could see how we could apply metadata
to tell LLVM that data0/1 does not alias dataB, but the "i != j" analysis seems
much more difficult.

Put another way: is it enough to add metadata to 'dataB', or do we need to add
aliasing metadata to 'elementB' as well?

-Ben Harshbarger


From: Przemek Leśniak <[email protected]>
Date: Tuesday, August 22, 2017 at 3:15 PM
To: "[email protected]" 
<[email protected]>
Subject: [Chapel-developers] Improving aliasing analysis in LLVM IR for Chapel

Hello Chapel developers,
With recent work on tuplesVSarrays performance issue I noticed that part of 
performance degradation was related to aliasing. I have discovered quite 
trivial case where adding noalias and alias.scope metadata would greatly 
improve performance, due to LLVM being unable to notice that specific array 
doesn't alias:

Here is a link to the full file:

https://github.com/coodie/chapel/blob/2d1e0c52a7727bda863afb9e43f6261d321d1aa2/test/llvm/noalias-experiment/experiment.chpl

Here is most relevant part of the code:

  for i in Dom do
    for k in 1..subOrder do
      Array[i] = subArray[k];
Array and subArray are trivially known not to alias (because both of these 
symbols aren't references so they cannot alias). In this case this is 
equivalent to:
  const x = subArray[subOrder];
  for i in Dom do
      Array[i] = x;

But LLVM is unable to see that. What I did is adding specific metadata that 
says that loads and stores in `Array[i] = subArray[k];` assignment don't alias, 
and in this particular case it gave performance improvement - the code run 3x 
faster with subOrder of 30, but gave improvement when subOrder was smaller 
which is not surprising as there was less stores to eliminate. I believe there 
are more complex (and useful) cases where extra information about aliasing 
would improve performance due to some optimizations occuring and this is what 
I'm currently trying to find which is quite difficult, because adding metadata 
the way I did for this experiment is incredibly tedious and difficult when 
there are more arrays.

My idea for adding this metadata would be for expressions that contain array 
symbol that are known not to alias (because, say they are all marked as 'var'). 
That's probably not the most precise explanation, but what I want to say is 
that I just want to cover simple and obvious cases. Here is clearer example of 
place where I find adding this metadata is plausible:

var A : [1..a];
var B : [1..b];
var C : [1..c];

//... possibly some computation here

for i in 1..a do
  for j in 1..b do
    for k in 1..c do
      A[i] = A[i]+B[j]*C[k];

In the innermost loop expression we can mark references to A[i], B[j] and C[k] 
to be not aliasing.

The way it is done in LLVM is to create metadata for domain, add metadata for 
every 'scope', create lists of scopes and then mark loads and stores using 
alias.scope and noalias metadata. When operation is marked using noalias with 
reference to list of scopes, this means that this load doesn't alias with with 
scopes in this list. When operation is marked with alias.scope it means that 
this operation refers to certain scopes.

I think it's best to actually read the documentation as it's described pretty 
well in there:

http://llvm.org/docs/LangRef.html#noalias-and-alias-scope-metadata

Basic idea is fairly simple: for every distinct array symbol in expression 
create new scope, and list of scopes not containing this array's scope but al 
remaining ones in the expression. Then mark all loads and stores to this array 
using noalias with list of remaining arrays and alias.scope with scope 
referring only to this array.

So in general, when we generate store and load we need to know at this point:
1) Expression we are in
2) Array symbol we are referring to
3) Whether expression is good candidate to add the metadata

Current chapel compiler implementation doesn't make it very easy to access any 
of that information in functions generating stores and loads (codegenStoreLLVM, 
codegenLoadLLVM) and before compiler gets to the codegen, there is a lot of 
things going on. Since I'm excited about this It'd be great if some developer 
helped me with that (someone who knows AST and LLVM relatively well).

Anyway, I'd like to receive some feedback on general idea: which cases we could 
cover (they don't necessarily need to be related to arrays), would it be 
easy/hard to implement, where it could possibly improve performance.

Cheers, Przemek
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Chapel-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/chapel-developers

Reply via email to