Hi,

On 04/22/2013 10:05 PM, Wei WU(吴伟) wrote:
Hi,


On Sun, Apr 21, 2013 at 8:37 AM, Nicolas B. Pierron <
[email protected]> wrote:

On 04/20/2013 02:41 AM, 吴伟/Wei WU wrote:

On Fri, Apr 19, 2013 at 10:01 AM, Nicolas B. Pierron <
[email protected]> wrote:


  - Adding profile guided optimization, the idea would be to profile which
branches are used and to prune branches which are unused, either while
generating the MIR Graph, or as a second optimization phase working on
the
graph.  A trivial way of doing so can just look at the jump target which
have been visited so far.  A more clever one might register the
transition
and use Markov chain to determine relations between branches, and
duplicate
a sub-part of the graph to improve constant propagation and the range
analysis.  Before doing anything fancy like the more-clever case we still
need to check that this is worth it and see the benefit of other
optimizations if we fold the graph.

  I'm interested in this idea and I'm willing to implement it. A simple
and
fast algorithm may be a good choice for GSoC, while the advanced methods
require more investigation. I haven't found any branch profiling code in
the interpreter so the instrumenting mechanism must be finished first, or
we can leverage baseline compiler to generate self-instrumented jit code.
Profiling data can be saved separately or in MIR nodes as an annotation.
In
the second case more other passes may benefit from it easily.


Indeed, there is no code for profiling yet.  The most easiest way to think
about it is to do like the write barrier of the incremental GC, i-e having
a buffer (potentially circular, as the opposite of the write barrier) which
is filled with jump targets.  Thus, every time we jump to another location
in the code we push the location of the offset in this buffer, and later we
reuse this buffer to re-organize basic blocks in IonMonkey and to duplicate
/ prune basic blocks if needed.


There are two possible ways to store branch profiling data. One is count
array, the other is target buffer.
I've read LLVM's implementation of Branch(Edge) Profiling and found that it
uses a vector (array) to save the information. LLVM allocates two counters
for each branch/edge. When a block jumps to another the correlate counters
will be incremented by 1. Then these frequencies will be transformed into
probabilities and consumed by some transform passes. Jikes RVM stores
branch profiling data in the similar way.
One possible problem is that they use block ID to index branch counters and
we don't have such things in JavaScript bytecodes.

Indeed, but we can use the jsbytecode pointers, as targets, and later convert them to basic block numbers when we process the branch profile.

On the other hand, branch target buffer maintains a sequence of branch
targets and the index problem can be avoided. Further more, It makes
possible to determine relations between branches. The main problem I have
considered is that the cost of calculating branch probabilities might be
high, since we must traverse the buffer to summarize the results.

I don't think the cost would be high, current CPU are good at prefetching memory ahead of time on linear read of a buffer, which is likely the case that we will encounter. What you need then would be a way to map the target to the basic block to recover the counters.

It might be good, as a first step to have a circular buffer to experiment with the branch prediction, and later reduce it to simple counters if the buffer proved to be a major pitfall in terms of performance, or if the we do not have any benchmark which can take advantage of a more-clever way of inferring branches ordering.

I have encountered some other issues. Could you give me some suggestions?
- Instrumenting the Interpreter is fairly easy, while I'm not sure it is
possible to instrument baseline compiler in the similar way. I think it's
better to support them both.

As opposed to TI, which need to keep a consistent model of the types flowing in, there is no restriction at only taking a sub-set of the profiling data.

Profiling only in the Baseline's code should be enough and would avoid adding extra cost to the interpreter which is used by all script which are running a few times.

- Overhead. Instrumenting every conditional jumps may degrades performance
and an 'accepted overhead rate' should be considered. Also, it should be
able to switch on/off by command line arguments.

I agree with the CLI switch. The "accepted overhead" is weel-defined as being our benchmark score and the memory usage. If you can improve our benchmark score without taking too much memory, then this would be acceptable performances.

- Store/Restore profiling data on disks. Branch profiling data can be
stored and restored in LLVM and Jikes RVM. But I don't think it is
necessary in SpiderMonkey.

Indeed, I don't think it would be necessary either, and it would be even more painful as the script might change when reloading the page, and matching it back to the new script my lead to unexpected behaviour.

- Designing the interface of Branch Profiling. Since I was unable to find
design guidelines for SpiderMonkey, I will propose a draft interface in my
proposal and update it in future.

The first goal would be to determine what might benefit from the removal of unused (infrequently used) basic blocks. And see how this impact performances.

It might also be interesting to test different configuration such as doing more bailouts and OSR based on what is recorded in the profile. such as a branch which is not frequently taken and which return to the loop-head might be worth discarding if on the other side we can improve the other optimizations such as alias analysis, constant propagation and the range analysis.

Even if branches are not removed, this might hint other optimizations such as GVN & LICM to prevent moving some operations to locations where they might induce more cost.

I am planning to submit a proposal for this project. Would you like to be
my mentor during this summer?

If your project is accepted by the GSoC program and by Mozilla, then I can mentor this project.

--
Nicolas B. Pierron

_______________________________________________
dev-tech-js-engine-internals mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals

Reply via email to