Hi,

On Sat, Apr 27, 2013 at 5:19 PM, Jan de Mooij <[email protected]> wrote:
> Hi,
>
> This looks really interesting. Not emitting code for cold branches could
> not only help LICM and GVN but also register allocation by avoiding
> unnecessary spills and moves.
>
> However, my main concerns are:
>
> (1) The circular buffer to collect branch information seems too
> complicated. It would be simpler and faster to have the baseline compiler
> just increment a counter at the start of every basic block (we can easily
> get this information from the bytecode analysis). I think counters also
> allow you to keep more useful data (one hot loop will quickly fill the
> circular buffer and you lose information about other branches).

To avoid the "hot loop" problem, we could allocate a counter for each
address in the buffer and increment that counter whenever the same address
is pushed again. We only need to compare the new address against the last
two or three pushed addresses, so the overhead would not be high. This
design is more complicated, unfortunately. A circular buffer can also
provide a form of path profiling data, which might be useful for some
optimizations. On the other hand, I agree that basic block counters are
easier to implement and provide enough information for branch prediction.
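
To make the comparison concrete, here is a minimal sketch of the
buffer-with-counters variant I have in mind. The types, names, and sizes
are placeholders for illustration, not actual SpiderMonkey structures; the
real buffer would hang off the script's profiling data and be filled from
code emitted by the baseline compiler.

    // Sketch only: placeholder types, not real SpiderMonkey code.
    #include <cstddef>
    #include <cstdint>

    struct BranchRecord {
        uintptr_t branchAddr;  // address (or bytecode offset) of the jump
        uintptr_t targetAddr;  // target that was actually taken
        uint32_t  hitCount;    // times this (branch, target) pair was seen
    };

    class BranchProfileBuffer {
        static const size_t kCapacity = 256;  // small fixed-size ring
        BranchRecord entries_[kCapacity];
        size_t head_;   // index of the most recently written entry
        size_t size_;   // number of valid entries

      public:
        BranchProfileBuffer() : head_(0), size_(0) {}

        // Record one taken branch. A hot loop hits the same (branch, target)
        // pair over and over, so first check the last few entries and bump
        // the counter instead of flushing the rest of the buffer.
        void record(uintptr_t branchAddr, uintptr_t targetAddr) {
            const size_t kLookback = 3;
            for (size_t i = 0; i < kLookback && i < size_; i++) {
                BranchRecord& r = entries_[(head_ + kCapacity - i) % kCapacity];
                if (r.branchAddr == branchAddr && r.targetAddr == targetAddr) {
                    r.hitCount++;
                    return;
                }
            }
            head_ = (head_ + 1) % kCapacity;
            entries_[head_].branchAddr = branchAddr;
            entries_[head_].targetAddr = targetAddr;
            entries_[head_].hitCount = 1;
            if (size_ < kCapacity)
                size_++;
        }
    };

The lookback of two or three entries is what keeps the overhead low;
anything longer would start to look like a hash table, at which point
plain per-block counters would probably be the better choice anyway.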

> (2) Instead of pruning MIR from the graph, IonBuilder could just
> immediately terminate "cold" basic blocks (very similar to what it does
> when it reaches a JSOP_RETURN or JSOP_THROW). Just add a new MIR
> instruction that unconditionally bails out.

This approach is new to me and seems easier than pruning. I think the two
share some common work once a basic block is marked "cold": removing the
instructions in the block, updating use-def chains, and reducing type sets
and phi nodes. Pruning MIR from the graph has more work to do but might
gain extra optimization opportunities. I would like to implement the
"terminate" approach first and see whether it works. Then, if necessary,
we can implement the "pruning" approach quickly on top of it.
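
Either way, both the "terminate" and the "pruning" approaches need the
same first step: deciding from the profile data whether a branch target is
cold. A standalone sketch of that decision is below; the summary type and
the thresholds are placeholders I made up for illustration, not existing
IonMonkey structures, and the real pass would read the per-script buffer
or counters directly.

    // Sketch only: placeholder types and thresholds.
    #include <cstdint>
    #include <map>
    #include <utility>

    // counts[{branchAddr, targetAddr}] = how often that target was taken.
    typedef std::map<std::pair<uintptr_t, uintptr_t>, uint64_t> TargetCounts;

    // A target is "cold" if the branch was sampled often enough but this
    // particular target was almost never the one taken.
    bool IsColdTarget(const TargetCounts& counts,
                      uintptr_t branchAddr, uintptr_t targetAddr,
                      uint64_t minSamples = 100, double coldFraction = 0.01)
    {
        uint64_t taken = 0, total = 0;
        for (TargetCounts::const_iterator it = counts.begin();
             it != counts.end(); ++it) {
            if (it->first.first != branchAddr)
                continue;
            total += it->second;
            if (it->first.second == targetAddr)
                taken += it->second;
        }
        if (total < minSamples)
            return false;  // too little data; keep the existing heuristics
        return double(taken) / double(total) < coldFraction;
    }

How aggressive minSamples and coldFraction should be is exactly the
trade-off against bailout cost that the project aims to explore.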

> (3) Before implementing anything, it would be good to look at some
> benchmarks (Kraken, Octane, maybe Emscripten code) and see where we expect
> the main perf wins. Then you can hack/prototype something just for that
> particular case and see what happens.

Currently I'm going through Kraken to find some functions for a case
study, but progress is quite slow. I could not find a profiling tool that
shows the frequency of each branch, so I am literally reading the
JavaScript code and making guesses. I use the Gecko profiler, JIT
Inspector, and IonGraph to speed up my work. Could you recommend some
other utilities? That would be a great help.

> Jan
>
> On Sat, Apr 27, 2013 at 10:04 AM, Wei WU(吴伟) <[email protected]> wrote:
>
>> Hi,
>>
>> I have finished the initial version of my proposal and submitted it to
>> the GSoC website. I put a copy here in case you don't have permission to
>> comment on it. You can also review my proposal via this link:
>> https://github.com/lazyparser/gsoc2013/blob/master/proposal.md
>>
>> Your comments and suggestions are welcome!
>>
>> *Title*
>>
>> Implement Profile Guided Optimization in IonMonkey
>>
>> *Abstract*
>>
>> The aim of the project is to implement a branch profiling module and to
>> improve IonMonkey optimizations by leveraging the information gathered
>> from the profiling module.
>>
>> *Project Proposal*
>>
>> I propose to implement a branch profiling module and a MIR pass for
>> pruning infrequently used branches in IonMonkey.
>>
>> *Principles*
>>
>> Profile Guided Optimizations have been adopted by modern compilers such
>> as LLVM and by many JIT compilers in Java Virtual Machines. Many
>> optimizations can use profiling data to generate faster code. Currently
>> IonMonkey lacks a mechanism to gather branch profiling information and
>> has to use heuristics to guess the probability of each conditional jump.
>> These heuristics are usually conservative and may block further
>> (aggressive) optimization opportunities.
>>
>> By introducing branch profiling data, IonMonkey can aggressively remove
>> infrequently used basic blocks and therefore simplify subsequent
>> analyses like type inference and alias analysis. Nicolas Pierron, an
>> IonMonkey developer, suggested that other optimizations such as GVN and
>> LICM might benefit from the profiling information even if branches were
>> not removed.
>>
>> *Project scope*
>>
>> Ideally, every analysis and optimization in IonMonkey might benefit from
>> branch profiling data and should be explored, and the profiling module
>> should be able to gather other kinds of runtime information beyond
>> conditional jumps. However, due to the time limits of a GSoC project,
>> this project will implement branch profiling in the baseline compiler
>> plus one MIR transformation pass.
>>
>> *Technical details*
>>
>> This project can be divided into two parts: the profiling module and the
>> MIR transformation pass. I describe the details of both in this section.
>>
>> *Branch Profiling Module*
>>
>> The Branch Profiling Module (BPM) collects the targets of conditional
>> jumps by instrumenting the SpiderMonkey engine. There are three
>> potential instrumentation points in SpiderMonkey. Instrumenting the
>> interpreter is the easiest way to get things working, but the data it
>> collects is fragmented and less useful, since "hot" JavaScript methods
>> are not executed by the interpreter. IonMonkey is another place where
>> profiling code could be inserted, but the overhead this would add makes
>> it infeasible. BPM therefore leverages the Baseline Compiler to generate
>> instrumented machine code: it rewrites the output of the baseline
>> compiler when it generates code for the JSOP_IFEQ, JSOP_IFNE, and
>> JSOP_TABLESWITCH bytecodes.
>>
>> BPM uses a circular buffer to store data and allocates one buffer for
>> each IonScript object. Every time BPM is triggered, it pushes both the
>> address of the conditional jump instruction and the target address it
>> jumps to. The main benefit of this design is that not only can the
>> frequency of each branch be calculated, but the relationships between
>> jumps can also be inferred by more advanced algorithms.
>>
>> BPM can be switched on and off through command-line options or the
>> browser's configuration panel. It can also be toggled via the JSAPI. The
>> interface would look like this:
>>
>> JS_IsBranchProfilingEnabled(JSContext* cx);
>> JS_EnableBranchProfiling(JSContext* cx);
>> JS_DisableBranchProfiling(JSContext* cx);
>>
>> Although BPM can only be switched on and off at JSContext granularity
>> from outside SpiderMonkey, it stores a profiling status flag in
>> IonScript, so SpiderMonkey itself is able to profile a JSScript
>> (IonScript) individually.
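
As a side note while re-reading this section: embedder-side usage of these
proposed hooks would look roughly like the sketch below. None of these
functions exist in the current JSAPI; they are the ones this project would
add, and the exact signatures (return types especially) are still open.

    // Sketch only: the proposed hooks with assumed signatures, plus a tiny
    // embedder-side helper. JSContext is forward-declared just to keep the
    // sketch self-contained; a real embedder would include jsapi.h instead.
    struct JSContext;
    bool JS_IsBranchProfilingEnabled(JSContext* cx);
    void JS_EnableBranchProfiling(JSContext* cx);
    void JS_DisableBranchProfiling(JSContext* cx);

    // Run a workload with branch profiling forced on, restoring the
    // previous state afterwards.
    void RunWithBranchProfiling(JSContext* cx)
    {
        bool wasEnabled = JS_IsBranchProfilingEnabled(cx);
        if (!wasEnabled)
            JS_EnableBranchProfiling(cx);

        // ... evaluate scripts here; baseline-compiled code fills the
        // per-script profile buffers while the flag is set ...

        if (!wasEnabled)
            JS_DisableBranchProfiling(cx);
    }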

>> *MIR Branch Pruning Pass*
>>
>> The Branch Pruning Pass (BPP) is invoked right after the MIR is
>> generated by IonBuilder. Based on the probabilities derived from the
>> branch profiling buffer, BPP speculatively removes unused or
>> infrequently used basic blocks. After removing these "cold branches",
>> BPP is able to reduce the type sets and simplify the work of alias
>> analysis.
>>
>> The main drawback of pruning branches is that a bailout occurs whenever
>> a pruned branch is actually taken: the more basic blocks are removed,
>> the more bailouts may occur. One goal of this project is to find the
>> balance point between the aggressiveness of pruning and the cost of
>> bailouts.
>>
>> *Schedule of Deliverables*
>>
>> *Deliverables*
>>
>> The final deliverables would consist of:
>> 1. A patch that implements the Circular Buffer utility and related test
>> cases.
>> 2. A patch that implements the Branch Profiling Module and related test
>> cases.
>> 3. A patch that implements the MIR Branch Pruning Pass and related test
>> cases.
>> 4. A technical report illustrating the relationships among the
>> aggressiveness of pruning, type inference, alias analysis, and the cost
>> of bailouts.
>>
>> *Schedule*
>>
>> Week 1 - Week 2: Discuss the project details with my mentor. Get
>> familiar with the JS Engine team.
>> Week 3 - Week 4: Implement the Circular Buffer. Test and fix bugs.
>> Week 5 - Week 6: Implement the Branch Profiling prototype; instrument
>> the machine code generated by the baseline compiler.
>> Week 7 - Week 8: Add the conversion function to the Branch Profiling
>> class. Modify related MIR nodes to store the probability data.
>> Week 9 - Week 11: Implement the prototype of the Branch Pruning Pass.
>> Discuss the implementation details with my mentor.
>> Week 12 - Week 14: Benchmark the performance impact of BPM and BPP with
>> different pruning policies. Test and fix the bugs in BPP.
>> Week 15 - Week 16: Prepare the technical report. Submit patches.
>>
>> *Open Source Development Experience*
>> TBD.
>>
>> *Work/Internship Experience*
>> TBD.
>>
>> *Academic Experience*
>> TBD.
>>
>> *Why Me*
>> TBD.
>>
>> *Why Mozilla*
>> TBD.
>>
>> --
>> Best wishes,
>> Wei Wu (吴伟)

--
Best wishes,
Wei Wu (吴伟)
_______________________________________________
dev-tech-js-engine-internals mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
