Author: Carl Friedrich Bolz <[email protected]>
Branch: extradoc
Changeset: r4311:3e36e47c2e01
Date: 2012-07-19 08:26 +0200
http://bitbucket.org/pypy/extradoc/changeset/3e36e47c2e01/

Log:    - write about high-level handling of resume data
        - add a start of a bibliography

diff --git a/talk/vmil2012/paper.bib b/talk/vmil2012/paper.bib
--- a/talk/vmil2012/paper.bib
+++ b/talk/vmil2012/paper.bib
@@ -0,0 +1,56 @@
+
+@inproceedings{bebenita_spur:_2010,
+       address = {{Reno/Tahoe}, Nevada, {USA}},
+       title = {{SPUR:} a trace-based {JIT} compiler for {CIL}},
+       isbn = {978-1-4503-0203-6},
+       shorttitle = {{SPUR}},
+       url = {http://portal.acm.org/citation.cfm?id=1869459.1869517},
+       doi = {10.1145/1869459.1869517},
+       abstract = {Tracing just-in-time compilers {(TJITs)} determine 
frequently executed traces (hot paths and loops) in running programs and focus 
their optimization effort by emitting optimized machine code specialized to 
these traces. Prior work has established this strategy to be especially 
beneficial for dynamic languages such as {JavaScript}, where the {TJIT} 
interfaces with the interpreter and produces machine code from the {JavaScript} 
trace.},
+       booktitle = {{OOPSLA}},
+       publisher = {{ACM}},
+       author = {Bebenita, Michael and Brandner, Florian and Fahndrich, Manuel 
and Logozzo, Francesco and Schulte, Wolfram and Tillmann, Nikolai and Venter, 
Herman},
+       year = {2010},
+       keywords = {cil, dynamic compilation, javascript, just-in-time, tracing}
+},
+
+@inproceedings{bolz_allocation_2011,
+       address = {Austin, Texas, {USA}},
+       title = {Allocation removal by partial evaluation in a tracing {JIT}},
+       abstract = {The performance of many dynamic language implementations 
suffers from high allocation rates and runtime type checks. This makes dynamic 
languages less applicable to purely algorithmic problems, despite their growing 
popularity. In this paper we present a simple compiler optimization based on 
online partial evaluation to remove object allocations and runtime type checks 
in the context of a tracing {JIT.} We evaluate the optimization using a Python 
{VM} and find that it gives good results for all our (real-life) benchmarks.},
+       booktitle = {{PEPM}},
+       author = {Bolz, Carl Friedrich and Cuni, Antonio and Fija{\l}kowski, Maciej and Leuschel, Michael and Pedroni, Samuele and Rigo, Armin},
+       year = {2011},
+       keywords = {code generation, experimentation, interpreters, languages, 
optimization, partial evaluation, performance, run-time environments, tracing 
jit}
+},
+
+@inproceedings{bolz_runtime_2011,
+       address = {New York, {NY}, {USA}},
+       series = {{ICOOOLPS} '11},
+       title = {Runtime feedback in a meta-tracing {JIT} for efficient dynamic 
languages},
+       isbn = {978-1-4503-0894-6},
+       url = {http://doi.acm.org/10.1145/2069172.2069181},
+       doi = {10.1145/2069172.2069181},
+       abstract = {Meta-tracing {JIT} compilers can be applied to a variety of 
different languages without explicitly encoding language semantics into the 
compiler. So far, they lacked a way to give the language implementor control 
over runtime feedback. This restricted their performance. In this paper we 
describe the mechanisms in {PyPy's} meta-tracing {JIT} that can be used 
to control runtime feedback in language-specific ways. These mechanisms are 
flexible enough to express classical {VM} techniques such as maps and runtime 
type feedback.},
+       booktitle = {Proceedings of the 6th Workshop on Implementation, 
Compilation, Optimization of Object-Oriented Languages, Programs and Systems},
+       publisher = {{ACM}},
+       author = {Bolz, Carl Friedrich and Cuni, Antonio and Fija{\l}kowski, Maciej and Leuschel, Michael and Pedroni, Samuele and Rigo, Armin},
+       year = {2011},
+       keywords = {code generation, interpreter, meta-programming, runtime 
feedback, tracing jit},
+       pages = {9:1--9:8}
+},
+
+@inproceedings{bolz_tracing_2009,
+       address = {Genova, Italy},
+       title = {Tracing the meta-level: {PyPy's} tracing {JIT} compiler},
+       isbn = {978-1-60558-541-3},
+       shorttitle = {Tracing the meta-level},
+       url = {http://portal.acm.org/citation.cfm?id=1565827},
+       doi = {10.1145/1565824.1565827},
+       abstract = {We attempt to apply the technique of Tracing {JIT} 
Compilers in the context of the {PyPy} project, i.e., to programs that are 
interpreters for some dynamic languages, including Python. Tracing {JIT} 
compilers can greatly speed up programs that spend most of their time in loops 
in which they take similar code paths. However, applying an unmodified tracing 
{JIT} to a program that is itself a bytecode interpreter results in very 
limited or no speedup. In this paper we show how to guide tracing {JIT} 
compilers to greatly improve the speed of bytecode interpreters. One crucial 
point is to unroll the bytecode dispatch loop, based on two kinds of hints 
provided by the implementer of the bytecode interpreter. We evaluate our 
technique by applying it to two {PyPy} interpreters: one is a small example, 
and the other one is the full Python interpreter.},
+       booktitle = {{ICOOOLPS}},
+       publisher = {{ACM}},
+       author = {Bolz, Carl Friedrich and Cuni, Antonio and Fija{\l}kowski, Maciej and Rigo, Armin},
+       year = {2009},
+       pages = {18--25}
+}
\ No newline at end of file
diff --git a/talk/vmil2012/paper.tex b/talk/vmil2012/paper.tex
--- a/talk/vmil2012/paper.tex
+++ b/talk/vmil2012/paper.tex
@@ -169,20 +169,122 @@
 \section{Resume Data}
 \label{sec:Resume Data}
 
+Since tracing linearizes control flow by following one concrete execution,
+it does not observe the full control flow of a program.
+The possible points of deviation from the trace are guard operations,
+which check whether the assumptions made during tracing still hold.
+In later executions of the trace these guards can fail.
+If that happens, execution needs to continue in the interpreter.
+This means it is necessary to attach enough information to a guard
+to reconstruct the interpreter state when that guard fails.
+This information is called the \emph{resume data}.
+
+To do this reconstruction, it is necessary to take the values
+of the SSA variables of the trace
+and build interpreter stack frames from them.
+Since tracing aggressively inlines functions,
+the reconstructed state of the interpreter
+can consist of several interpreter frames.
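+
+The following sketch illustrates this mapping at a high level
+(it is not the actual RPython code;
+the data layout and the names are invented for illustration):
+
+\begin{lstlisting}
+# illustration only: invented layout, not RPython's real data structures
+def rebuild_frames(resume_data, ssa_values):
+    # resume_data: one (pc, locals_description) pair per interpreter
+    # frame, outermost first; each locals_description maps a local
+    # variable name to ("ssa", index) or ("const", value)
+    # ssa_values: concrete values of the trace's SSA variables at the
+    # point where the guard failed
+    frames = []
+    for pc, locals_description in resume_data:
+        frame_locals = {}
+        for name, (kind, payload) in locals_description.items():
+            if kind == "ssa":
+                frame_locals[name] = ssa_values[payload]
+            else:  # "const"
+                frame_locals[name] = payload
+        frames.append((pc, frame_locals))
+    return frames
+
+# two frames are rebuilt because tracing inlined a call
+resume_data = [
+    (14, {"x": ("ssa", 0), "acc": ("ssa", 2)}),     # caller
+    (3,  {"n": ("ssa", 2), "step": ("const", 1)}),  # inlined callee
+]
+print(rebuild_frames(resume_data, ssa_values=[7, None, 21]))
+\end{lstlisting}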
+
+If a guard fails often enough, a new trace is started from it
+to create a trace tree.
+In that case the resume data is also used
+to construct the state of the tracer.
+
+There are several forces guiding the design of resume data handling.
+Guards are very common operations in traces,
+even though a large percentage of all operations
+are optimized away before code generation.
+Since there are a lot of guards,
+the resume data needs to be stored in a very compact way.
+On the other hand, tracing should be as fast as possible,
+so the construction of resume data must not take too much time.
+
+\subsection{Capturing of Resume Data During Tracing}
+\label{sub:capturing}
+
+Every time a guard is recorded during tracing,
+the tracer attaches preliminary resume data to it.
+The data is preliminary in that it is not particularly compact yet.
+The preliminary resume data takes the form of a stack of symbolic frames.
+The stack contains only those interpreter frames seen by the tracer.
+The frames are symbolic in that the local variables in the frames
+do not contain values.
+Instead, every local variable contains the SSA variable of the trace
+from which its value will later come, or a constant.
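+
+A symbolic frame can be pictured as in the following sketch
+(the class names are invented and do not correspond
+to the tracer's actual classes):
+
+\begin{lstlisting}
+# sketch only: invented names, not the RPython tracer's classes
+class SSAVariable(object):
+    # a variable of the trace; its concrete value is only known
+    # when the trace is executed
+    def __init__(self, index):
+        self.index = index
+
+class Constant(object):
+    def __init__(self, value):
+        self.value = value
+
+class SymbolicFrame(object):
+    # one interpreter frame as seen by the tracer: a program counter
+    # and, for every local variable, an SSAVariable or a Constant
+    # instead of an actual value
+    def __init__(self, pc, local_vars, back):
+        self.pc = pc
+        self.local_vars = local_vars
+        self.back = back  # next older frame, or None
+
+# snapshot of the frame stack when a guard is recorded; the callee
+# frame exists because tracing inlined the call
+caller = SymbolicFrame(pc=14,
+                       local_vars=[SSAVariable(0), SSAVariable(2)],
+                       back=None)
+callee = SymbolicFrame(pc=3,
+                       local_vars=[SSAVariable(2), Constant(1)],
+                       back=caller)
+\end{lstlisting}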
+
+\subsection{Compression of Resume Data}
+\label{sub:compression}
+
+The core idea of storing resume data as compactly as possible
+is to share parts of the data structure between subsequent guards.
+This is often possible because the density of guards in traces is so high
+that frequently not much changes between them.
+Since resume data is a linked list of symbolic frames,
+often only the information in the top frame changes from one guard to the next.
+The other frames can simply be reused.
+The reason for this is that during tracing only the variables
+of the currently executing frame can change.
+Therefore, if two guards are generated from code in the same function,
+the resume data for the rest of the stack can be reused.
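+
+The following sketch illustrates this sharing for two consecutive
+guards (again with invented names; SSA variables are abbreviated
+as strings):
+
+\begin{lstlisting}
+# sketch only: consecutive guards share the symbolic frames that
+# cannot have changed between them
+class SymbolicFrame(object):
+    def __init__(self, pc, local_vars, back):
+        self.pc = pc
+        self.local_vars = local_vars  # SSA variables or constants
+        self.back = back              # next older frame, or None
+
+# frame stack at the first guard; f2 is the currently executing frame
+f0 = SymbolicFrame(pc=2, local_vars=["v0"], back=None)
+f1 = SymbolicFrame(pc=7, local_vars=["v1", "v3"], back=f0)
+f2 = SymbolicFrame(pc=1, local_vars=["v3"], back=f1)
+guard1_resume_data = f2
+
+# at the next guard only the topmost frame has advanced, so a new
+# top frame is built while f1 and f0 are reused unchanged
+f2_new = SymbolicFrame(pc=4, local_vars=["v3", "v5"], back=f1)
+guard2_resume_data = f2_new
+
+assert guard1_resume_data.back is guard2_resume_data.back
+\end{lstlisting}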
+
+\subsection{Interaction With Optimization}
+\label{sub:optimization}
+
+Guards interact with optimizations in various ways.
+Most importantly, optimizations try to remove as many operations,
+and therefore guards, as possible.
+This is done with many classical compiler optimizations.
+In particular, guards can be removed by subexpression elimination.
+If the same guard is encountered a second time in the trace,
+the second one can be removed.
+This also works if a later guard is weaker than,
+and therefore implied by, an earlier guard.
+
+A guard-removal technique in the optimizer that is specific to tracing
+is guard strengthening~\cite{bebenita_spur:_2010}.
+The idea of guard strengthening is that if a later guard is stronger
+than an earlier guard, it makes sense to move the stronger guard
+to the point of the earlier, weaker guard and to remove the weaker guard.
+Moving a guard to an earlier point is always valid;
+it just means that the guard fails earlier during the execution of the trace
+(the other direction is clearly not valid).
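+
+As a toy illustration of the idea (far simpler than the real
+optimizer), consider guards that each demand a lower bound
+on the same value:
+
+\begin{lstlisting}
+# toy sketch, not PyPy's optimizer: guards are modelled as lower
+# bounds "x >= n" on a single value, given in trace order
+def optimize_guards(bounds):
+    # a guard identical to or weaker than an earlier one is implied
+    # by it and removed; a stronger later guard is hoisted to the
+    # position of the first guard, which then subsumes all others
+    strongest = bounds[0]
+    for bound in bounds[1:]:
+        strongest = max(strongest, bound)
+    first_position = 0
+    return first_position, strongest
+
+# guard(x >= 1), guard(x >= 1), guard(x >= 10) becomes a single
+# guard(x >= 10) at the position of the first guard
+assert optimize_guards([1, 1, 10]) == (0, 10)
+\end{lstlisting}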
+
+The other important point of interaction between resume data and the optimizer
+is RPython's allocation removal optimization~\cite{bolz_allocation_2011}.
+This optimization discovers allocations in the trace that create objects
+that do not survive long.
+An example is the instance of \lstinline{Even}
+in the example\cfbolz{reference figure}.
+Allocation removal makes resume data more complex.
+Since allocations are removed from the trace, it becomes necessary
+to reconstruct, when a guard fails, the objects that have not been allocated yet.
+Therefore the resume data needs to store enough information
+to make this reconstruction possible.
+
+Adding this additional information is done as follows.
+So far, every variable in the symbolic frames
+contains a constant or an SSA variable.
+After allocation removal the variables in the symbolic frames can also contain
+``virtual'' objects.
+These are objects that have not been allocated yet,
+because the optimizer removed their allocation.
+The virtual objects in the symbolic frames describe exactly
+what the heap objects that have to be allocated on guard failure look like.
+To this end, the content of every field of a virtual object is described
+in the same way that the local variables of the symbolic frames are described.
+The fields of the virtual objects can therefore be SSA variables, constants,
+or other virtual objects.
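+
+The following sketch shows what a virtual object in the resume data
+could look like (invented names again; the field name of the
+\lstinline{Even} instance is hypothetical):
+
+\begin{lstlisting}
+# sketch only: a virtual object records the class of a removed
+# allocation and, for every field, how to obtain its value when a
+# guard fails -- an SSA variable, a constant, or another virtual
+class SSAVariable(object):
+    def __init__(self, index):
+        self.index = index
+
+class Constant(object):
+    def __init__(self, value):
+        self.value = value
+
+class VirtualObject(object):
+    def __init__(self, cls_name, fields):
+        self.cls_name = cls_name
+        self.fields = fields  # maps field name -> description
+
+# an Even instance whose allocation was removed; its (hypothetical)
+# 'value' field is filled from SSA variable 4 if a guard fails and
+# the object has to be allocated after all
+virtual_even = VirtualObject("Even", {"value": SSAVariable(4)})
+
+# in a symbolic frame a local variable simply refers to the virtual
+frame_locals = [virtual_even, Constant(0)]
+\end{lstlisting}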
+
+When resume data is stored, virtual objects are also shared
+between subsequent guards as much as possible.
+The same observation as for frames applies:
+quite often a virtual object does not change from one guard to the next.
+In that case the data structure is shared.
+
+% subsection Interaction With Optimization (end)
+
 * High level handling of resumedata
 
-- traces follow the execution path during tracing, other path not compiled at 
first
-- points of possible divergence from that path are guards
-- since path can later diverge, at the guards it must be possible to re-build 
interpreter state in the form of interpreter stack frames
-- tracing does inlining, therefore a guard must contain information to build a 
whole stack of frames
-- optimization rewrites traces, including removal of guards
-
 - frames consist of a PC and local variables
-- rebuild frame by taking local SSA variables in the trace and mapping them to 
variables in the frame
-
-two forces:
-- there are lots of guards, therefore the information must be stored in a 
compact way in the end
-- tracing must be fast
 
 compression approaches:
 - use fact that outer frames don't change in the part of the trace that is in 
the inner frame