Hi,
I would recommend that this optimization be restricted to DefaultRectangular
arrays, at least to start with. They are the simplest arrays to reason about
and we can more easily make assumptions and guarantees about them. Mainly,
DefaultRectangular arrays cannot alias another array's elements.
I would also recommend that this be an optimization pass after the
loopInvariantCodeMotion pass. This will probably be the easiest time to perform
analysis since everything has been inlined and most expressions won't be
drastically
altered further.
I'll also recommend that this optimization only fire under "--local"
compilation,
at least to start. Dealing with local variables will be easier than remote
variables
at the beginning.
To determine if two arrays alias each other, I would recommend you start by
following symbols with the "_array" record type as they are move'd, addrof'd,
and passed to functions. If an _array is acquired from a PRIM_GET_MEMBER*,
a PRIM_ARRAY_GET*, or is the result of a function call, I would recommend you
start by considering that array to be an alias. You will probably need to
make an exception for the ' chpl__convertRuntimeTypeToValue' function that
creates the array. I'm not really familiar with traditional alias analysis,
but I suspect you'll want to build up sets of aliases.
At this point I'll pause and ask a question:
Let's say we have a chapel statement like this:
A[i] = A[j] + B[k];
It will boil down to AST like this:
```
var class0 = A._instance;
var data0 = class0.shiftedData;
ref element0 = data0[i];
var class1 = A._instance;
var data1 = class1.shiftedData;
ref element1 = data1[j];
var classB = B._instance;
var dataB = classB.shiftedData;
ref elementB = dataB[k];
element0 = element1 + elementB;
```
Here "data0[i]" is just like indexing into a C pointer. I believe this would
translate to a GEP in LLVM, meaning element0 would also be a pointer.
Do we need to perform the analysis to determine that i != j? Is there a way
we can offload that analysis to LLVM? I could see how we could apply metadata
to tell LLVM that data0/1 does not alias dataB, but the "i != j" analysis seems
much more difficult.
Put another way: is it enough to add metadata to 'dataB', or do we need to add
aliasing metadata to 'elementB' as well?
-Ben Harshbarger
From: Przemek Leśniak <[email protected]>
Date: Tuesday, August 22, 2017 at 3:15 PM
To: "[email protected]"
<[email protected]>
Subject: [Chapel-developers] Improving aliasing analysis in LLVM IR for Chapel
Hello Chapel developers,
With recent work on tuplesVSarrays performance issue I noticed that part of
performance degradation was related to aliasing. I have discovered quite
trivial case where adding noalias and alias.scope metadata would greatly
improve performance, due to LLVM being unable to notice that specific array
doesn't alias:
Here is a link to the full file:
https://github.com/coodie/chapel/blob/2d1e0c52a7727bda863afb9e43f6261d321d1aa2/test/llvm/noalias-experiment/experiment.chpl
Here is most relevant part of the code:
for i in Dom do
for k in 1..subOrder do
Array[i] = subArray[k];
Array and subArray are trivially known not to alias (because both of these
symbols aren't references so they cannot alias). In this case this is
equivalent to:
const x = subArray[subOrder];
for i in Dom do
Array[i] = x;
But LLVM is unable to see that. What I did is adding specific metadata that
says that loads and stores in `Array[i] = subArray[k];` assignment don't alias,
and in this particular case it gave performance improvement - the code run 3x
faster with subOrder of 30, but gave improvement when subOrder was smaller
which is not surprising as there was less stores to eliminate. I believe there
are more complex (and useful) cases where extra information about aliasing
would improve performance due to some optimizations occuring and this is what
I'm currently trying to find which is quite difficult, because adding metadata
the way I did for this experiment is incredibly tedious and difficult when
there are more arrays.
My idea for adding this metadata would be for expressions that contain array
symbol that are known not to alias (because, say they are all marked as 'var').
That's probably not the most precise explanation, but what I want to say is
that I just want to cover simple and obvious cases. Here is clearer example of
place where I find adding this metadata is plausible:
var A : [1..a];
var B : [1..b];
var C : [1..c];
//... possibly some computation here
for i in 1..a do
for j in 1..b do
for k in 1..c do
A[i] = A[i]+B[j]*C[k];
In the innermost loop expression we can mark references to A[i], B[j] and C[k]
to be not aliasing.
The way it is done in LLVM is to create metadata for domain, add metadata for
every 'scope', create lists of scopes and then mark loads and stores using
alias.scope and noalias metadata. When operation is marked using noalias with
reference to list of scopes, this means that this load doesn't alias with with
scopes in this list. When operation is marked with alias.scope it means that
this operation refers to certain scopes.
I think it's best to actually read the documentation as it's described pretty
well in there:
http://llvm.org/docs/LangRef.html#noalias-and-alias-scope-metadata
Basic idea is fairly simple: for every distinct array symbol in expression
create new scope, and list of scopes not containing this array's scope but al
remaining ones in the expression. Then mark all loads and stores to this array
using noalias with list of remaining arrays and alias.scope with scope
referring only to this array.
So in general, when we generate store and load we need to know at this point:
1) Expression we are in
2) Array symbol we are referring to
3) Whether expression is good candidate to add the metadata
Current chapel compiler implementation doesn't make it very easy to access any
of that information in functions generating stores and loads (codegenStoreLLVM,
codegenLoadLLVM) and before compiler gets to the codegen, there is a lot of
things going on. Since I'm excited about this It'd be great if some developer
helped me with that (someone who knows AST and LLVM relatively well).
Anyway, I'd like to receive some feedback on general idea: which cases we could
cover (they don't necessarily need to be related to arrays), would it be
easy/hard to implement, where it could possibly improve performance.
Cheers, Przemek
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Chapel-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/chapel-developers