Author: Maciej Fijalkowski <fij...@gmail.com>
Branch: extradoc
Changeset: r4933:ee71ba86558a
Date: 2012-12-07 15:54 +0200
http://bitbucket.org/pypy/extradoc/changeset/ee71ba86558a/
Log: merge

diff --git a/blog/draft/py3k-status-update-8.rst b/blog/draft/py3k-status-update-8.rst
new file mode 100644
--- /dev/null
+++ b/blog/draft/py3k-status-update-8.rst
@@ -0,0 +1,56 @@
+Py3k status update #8
+---------------------
+
+This is the eighth status update about our work on the `py3k branch`_, which
+we can work on thanks to all of the people who donated_ to the `py3k
+proposal`_.
+
+Just a short update on November's work: we're now passing about 194 of
+approximately 355 modules of CPython's regression test suite, up from passing
+160 last month. Many test modules only fail a small number of individual tests
+now.
+
+We'd like to thank Amaury Forgeot d'Arc for his contributions, in particular he
+has made significant progress on updating `CPyExt`_ for Python 3 this month.
+
+Some other highlights:
+
+* ``test_marshal`` now passes, and there's been significant progress on
+  pickling (thanks `Kenny Levinsen`_ and Amaury for implementing
+  ``int.{to,from}_bytes``)
+
+* We now have a ``_posixsubprocess`` module
+
+* More encoding related fixes, which affect many failing tests
+
+* ``_sre`` was updated and now ``test_re`` almost passes
+
+* Exception behavior is almost complete per the Python 3 specs, what's mostly
+  missing now are the new ``__context__`` and ``__traceback__`` attributes (`PEP
+  3134`_)
+
+* Fixed some crashes and deadlocks occurring during the regression tests
+
+* We merged the `unicode-strategies`_ branch both to default and to py3k: now we
+  have versions of lists, dictionaries and sets specialized for unicode
+  elements, as we already had for strings.
+
+* However, string-specialized containers are still faster in some cases
+  because there are shortcuts which have not been implemented for unicode yet
+  (e.g., constructing a set of strings from a list of strings). The plan is to
+  completely kill the shortcuts and improve the JIT to produce the fast
+  version automatically for both the string and unicode versions, to have a
+  more maintainable codebase without sacrificing the speed. The `autoreds`_
+  branch (already merged) was a first step in this direction.
+
+cheers,
+Philip&Antonio
+
+.. _donated: http://morepypy.blogspot.com/2012/01/py3k-and-numpy-first-stage-thanks-to.html
+.. _`py3k proposal`: http://pypy.org/py3donate.html
+.. _`py3k branch`: https://bitbucket.org/pypy/pypy/commits/all/tip/branch%28%22py3k%22%29
+.. _`autoreds`: https://bitbucket.org/pypy/pypy/commits/all/tip/branch%28%22autoreds%22%29
+.. _`unicode-strategies`: https://bitbucket.org/pypy/pypy/commits/all/tip/branch%28%22unicode-strategies%22%29
+.. _`CPyExt`: http://morepypy.blogspot.com/2010/04/using-cpython-extension-modules-with.html
+.. _`Kenny Levinsen`: https://twitter.com/Joushou
+.. _`PEP 3134`: http://www.python.org/dev/peps/pep-3134/
diff --git a/planning/2.0/todo.txt b/planning/2.0/todo.txt
--- a/planning/2.0/todo.txt
+++ b/planning/2.0/todo.txt
@@ -8,6 +8,6 @@
 * cffi on pypy on windows
 * raw malloc virtuals
 * bug tracker gardening
-  * 1292, 1090, 1294, 1282, 1289, 1282, 1286
+  * 1090, 1282, 1289, 1286
 * numpy: 1143, 1160, 1287
 * all green buildbots
diff --git a/sprintinfo/san-francisco-2012/announce.txt b/sprintinfo/san-francisco-2012/announce.txt
new file mode 100644
--- /dev/null
+++ b/sprintinfo/san-francisco-2012/announce.txt
@@ -0,0 +1,39 @@
+PyPy San Francisco Sprint Dec 1st - Dec 2nd 2012
+================================================
+
+The next PyPy sprint will be in San Francisco, California. It is a
+public sprint, suitable for newcomers. It will run on Saturday December 1st and
+Sunday December 2nd. The goals for the sprint are continued work towards the
+2.0 release as well as code cleanup, we of course welcome any topic which
+contributors are interested in working on.
+
+Some other possible topics are:
+
+* running your software on PyPy
+
+* work on PyPy's numpy (status__)
+
+* work on STM (status__)
+
+* JIT improvements
+
+* any exciting stuff you can think of
+
+If there are newcomers, we'll run the usual introduction to hacking on
+PyPy.
+
+.. __: http://morepypy.blogspot.ch/2012/09/numpy-on-pypy-status-update.html
+.. __: http://mail.python.org/pipermail/pypy-dev/2012-September/010513.html
+
+
+Location
+--------
+
+The sprint will be held at the Rackspace Office:
+
+620 Folsom St, Ste 100
+San Francisco
+
+The doors will open at 10AM both days, and run until 6PM both days.
+
+Thanks to David Reid for helping get everything set up!
diff --git a/sprintinfo/san-francisco-2012/planning.txt b/sprintinfo/san-francisco-2012/planning.txt
new file mode 100644
--- /dev/null
+++ b/sprintinfo/san-francisco-2012/planning.txt
@@ -0,0 +1,4 @@
+Planning
+========
+
+* Implement ``os.setgroups``
diff --git a/talk/dls2006/talk-long.txt b/talk/dls2006/talk-long.txt
new file mode 100644
--- /dev/null
+++ b/talk/dls2006/talk-long.txt
@@ -0,0 +1,353 @@
+.. include:: <s5defs.txt>
+
+=================================================
+PyPy's VM Approach
+=================================================
+
+:Authors: Armin Rigo, Samuele Pedroni
+:Date: 23 October 2006
+:Location: DLS'06
+
+PyPy
+========================
+
+- Python VM implementation
+  in Python (a well-chosen subset)
+- A translation tool-chain
+- Open source project (MIT license)
+
+VMs are still hard
+========================
+
+It is hard to achieve:
+
+- flexibility
+- maintainability
+- performance (needs
+  dynamic compilation techniques)
+
+Especially with limited resources.
+
+
+Python Case
+===================================
+
+CPython is a straightforward,
+portable VM.
+
+- Some decisions are pervasive:
+  reference counting, single global lock ...
+
+- No dynamic compilation.
+  Performance is limited.
+
+
+- Extensions:
+
+  * *Stackless* (heap-bound recursion,
+    coroutines, serializable continuations)
+
+  * *Psyco* (run-time specializer,
+    interesting results)
+
+
+Python Case (ii)
+===================================
+
+- Extensions...
+
+  ... need to keep track and are hard to maintain.
+  Hard to port Psyco to other architectures.
+
+- The community wants Python to run everywhere:
+  Jython (Java), IronPython (.NET).
+  Lots of effort and duplication.
+
+PyPy's approach
+=================================
+
+*Goal: generate VMs from a single
+high-level description of the language,
+in a retargettable way.*
+
+- Write an interpreter for a dynamic language (Python)
+  in a high-level language (Python)
+
+- Leave out low-level details, favour simplicity
+  and flexibility
+
+- Define a mapping to low-level targets, generating
+  VMs from the interpreter
+
+Mapping to low-level targets
+===============================
+
+- Mechanically translate the interpreter to multiple
+  lower-level targets (C-like, Java, .NET...)
+
+- Insert low-level aspects into the code as required by
+  the target (object layout, memory management...)
+
+- Optionally insert new pervasive features not expressed
+  in the source (continuations, specialization abilities...)
+
+Status of the project
+==========================
+
+Fully compliant interpreter, translatable to C,
+LLVM and the CLR.
+
+Maintainability: following the (fast-paced)
+language evolution is very easy.
+
+Flexibility: we were able to reimplement
+Stackless features without extensive
+changes to the baseline interpreter
+
+Performance: work in-progress,
+2.3 times slower than CPython
+without dynamic compilation (current goal)
+
+... and many experiments at various levels
+
+Translation approach
+==========================
+
+* Refine a subset of your favourite
+  language (e.g. Python) amenable
+  to analysis but expressive enough
+  to write interpreters in it.
+
+* Write a translation tool-chain
+  from this subset ("RPython")
+  to multiple targets (C-like, .NET, etc.)
+
+* The translation tool-chain should
+  implement (and be configurable to
+  be) a good mapping from the interpreter
+  to reasonably efficient implementations for
+  the various targets.
+
+Translation overview
+==========================
+
+.. raw:: html
+
+   <br>
+
+.. image:: image/arch2.png
+   :align: center
+
+
+Type Inference
+=================
+
+- based on abstract interpretation
+
+- fix-point forward propagation
+
+- extensible
+
+Targets as Type Systems
+========================
+
+- RPython types (lists, strings, dicts, instances and classes...)
+  may be too high-level for the target (e.g. in C, structs and pointers)
+
+- approach: reflect the essential aspects
+  of a target as a custom type system
+  into RPython (e.g. C-like types)
+
+::
+
+    STR = GcStruct('rpy_string',
+                   ('hash', Signed),
+                   ('chars', Array(Char)))
+
+Targets as Type Systems (ii)
+================================
+
+- implement a simulation
+  of the types in normal Python,
+  allowing code like this to run::
+
+    def ll_char_mul(char, length):
+        newstr = malloc(STR, length)
+        newstr.hash = 0
+        for i in range(length):
+            newstr.chars[i] = char
+        return newstr
+
+
+Targets as Type Systems (iii)
+===============================
+
+- extend the type inferencer
+  to understand usages of these types
+
+- use the type system
+  to express how regular, high-level RPython types
+  should be represented
+  at the level of the target
+
+- write implementation "helper" code (e.g. ``ll_char_mul``)
+  which is again RPython and can be type inferenced
+  and translated
+
+Translation Aspects
+=====================
+
+*Features not present in the source can be
+added during translation:*
+
+- memory management (Boehm, or reference counting
+  by transforming all control flow graphs, or our own
+  GCs - themselves written within the same framework as the
+  RPython "helper" code)
+
+.. GC Pressure blues
+
+Translation Aspects (ii)
+==========================
+
+- continuation capture, implemented by saving the low-level
+  frames' local variables into the heap and back
+
+- work in progress: turning an interpreter into a compiler
+  is a translation aspect too (based on binding-time analysis
+  and partial evaluation, extended to use the techniques of
+  Psyco)
+
+Translation Summary
+===========================
+
+*The translation tool-chain
+has proved effective:*
+
+- low-level details and
+  pervasive decisions can be
+  left out of the interpreter
+
+- it can target at the same time:
+  C, LLVM, the CLR
+  and is open for further backends (JVM in progress)
+
+- it can and has been used
+  in the context of other research
+  projects and spin-off ideas
+  (e.g. a JavaScript backend,
+  compilation of other RPython programs...)
+
+Website etc.
+=============
+
+* http://codespeak.net/pypy
+* IST EU co-funded project in FP6
+  (7 partners)
+* Thanks
+
+Run-time Specialization
+========================
+
+Previous experience: Psyco
+
+- a "just-in-time specializer" which can transparently
+  accelerate user code
+
+- a C hand-written "generating extension", in the terminology
+  of partial evaluation
+
+- similar to conventional JITs with the additional ability
+  to suspend compilation at any point, and wait for actual
+  run-time information (e.g. type of an object):
+  **promotion**.
+
+A Specializer as an Aspect
+==========================================
+
+General idea (the devil is in the details):
+
+- Transform the flowgraphs of the interpreter
+  into a compiler, using the type inference
+  framework to do binding-time analysis (runtime/
+  compile-time) based on a few hints.
+
+- Special hints to insert and control promotion.
+
+- We think that promotion is the key to making
+  it practical for large interpreters and complex
+  semantics.
+
+This is what we are working on right now.
+
+JIT Generation Diagram
+========================
+
+.. image:: image/arch-jit-gen.png
+   :align: center
+
+Translation Diagram
+=========================
+
+.. image:: image/arch-translation.png
+   :align: center
+
+Self-hosted JITs
+===========================
+
+- they work: Jikes VM
+- the language semantics need to
+  be captured into a good compiler
+- good means the resulting VM
+  should be fast enough
+- target hardware CPUs
+- lots of effort still, and hard
+  to reuse for another language
+
+Target platform VMs (JVM, CLR)
+==============================
+
+- semantics mismatch (e.g.
+  lookup) can result in speed penalty
+  or unnatural code
+
+- how to obliviously layer dynamic
+  compilation on top of a JIT
+  is effectively an open problem
+
+- urge to tweak the underlying VM
+
+- coding in Java, C#: not expressive
+  enough, same risks of inflexibility,
+  hard to revert pervasive decisions
+
+Open Virtual Machines
+==========================
+
+Reconfigurable at run time to run
+specific languages.
+
+- Open research area.
+
+- Large design space.
+
+- What are the best primitives?
+
+- Likely same trade-offs in
+  more acute form: need sharp tools.
+
+GC Pressure
+======================
+
+RPython is still a garbage collected language.
+
+Large allocation rate from interpreter objects
+(boxes, frames) but easily temporary objects
+too.
+
+Good allocation removal optimizations
+and memory management very much needed.
+
+.. |bullet| unicode:: U+02022
+.. footer:: DLS'06
+
diff --git a/talk/ustour2011/talk.txt b/talk/ustour2011/talk.txt
new file mode 100644
--- /dev/null
+++ b/talk/ustour2011/talk.txt
@@ -0,0 +1,69 @@
+
+* most Python benchmarks run much faster than with CPython or Psyco
+
+
+    what pypy-c is (project started in 2003, now 200KLoc + 150KLoc tests)
+    (2 years U.E. (~2005-2007) + 2 years Germany+Sweden (2010-running))
+
+    PyPy 1.4.1 supports Python 2.5; but we are almost done with support
+    for Python 2.7, which will be PyPy 1.5
+
+    boring demo (multi-line editing)
+
+    speeeeeeeeed
+
+    http://speed.pypy.org/
+
+    but underline *benchmarks* here: it's typically programs that repeatedly
+    do similar things for at least 10-20 seconds.
+
+    mention also memory usage
+
+
+* the real-world PyPy compiler toolchain itself (200 KLocs) runs twice as fast
+
+
+    "extreme" example: big program, very unfriendly to our approach of
+    tracing JITs
+
+
+* already supports 64bit and is in the process of supporting ARM
+
+
+    pypy-c on 64bits
+
+    (pypy-c on ARM -- jitted but slower so far (missing JIT+GC integration))
+
+
+* full compatibility with CPython (more than Jython/IronPython)
+* new "cpyext" layer which integrates existing CPython C extensions
+
+
+    the main issue is that C extension modules don't all work out of the box
+
+    but some do (slowly (which may matter or not))
+
+    the core supports "the full language", which is CPython minus some
+    small number of issues; the most visible ones are related to refcounts
+    (ends up closer than Jython/IronPython)
+
+
+* full (and JIT-ed) ctypes support to call C libraries from Python
+* supports Stackless Python (in-progress)
+* an experimental super-fast JIT-compilation of calls to C++ libraries
+
+
+    this is all experimental
+
+
+* architecture
+
+
+    interpreter written in Python (actually RPython, a subset)
+
+    gets "translated" to C code
+
+    various "aspects" are added during translation to C, like
+    the GC and the JIT
+
+    it's a tracing JIT (expand...?)
diff --git a/talk/vmil2012/jit-guards_submitted.pdf b/talk/vmil2012/jit-guards_submitted.pdf
deleted file mode 100644
Binary file talk/vmil2012/jit-guards_submitted.pdf has changed
diff --git a/talk/vmil2012/paper.tex b/talk/vmil2012/paper.tex
--- a/talk/vmil2012/paper.tex
+++ b/talk/vmil2012/paper.tex
@@ -120,18 +120,10 @@
 \conferenceinfo{VMIL'12,} {October 21, 2012, Tucson, Arizona, USA.}
 \CopyrightYear{2012}
 \copyrightdata{978-1-4503-1633-0/12/10}
-\crdata{}
+%\crdata{}
 
 \maketitle
 
-\category{D.3.4}{Programming Languages}{Processors}[code generation,
-incremental compilers, interpreters, run-time environments]
-
-\terms
-Languages, Performance, Experimentation
-
-\keywords{tracing JIT, guards, deoptimization}
-
 \begin{abstract}
 Tracing just-in-time (JIT) compilers record linear control flow paths,
 inserting operations called guards at points of possible divergence. These
@@ -144,6 +136,11 @@
 % \o/
 \end{abstract}
 
+\category{D.3.4}{Programming Languages}{Processors}[code generation,
+incremental compilers, interpreters, run-time environments]
+\terms
+Languages, Performance, Experimentation
+\keywords{tracing JIT, guards, deoptimization}
 
 %___________________________________________________________________________
 \section{Introduction}
@@ -512,7 +509,7 @@
 \label{sec:Guards in the Backend}
 
 \begin{figure}[ht]
-\includegraphics[width=0.4\textwidth]{figures/resume_data}
+\includegraphics[width=0.45\textwidth]{figures/resume_data}
 \vspace{-3mm}
 \caption{The resume data for Figure~\ref{fig:trace-log}}
 \label{fig:resume-data}
diff --git a/talk/vmil2012/vmil01-schneider.pdf b/talk/vmil2012/vmil01-schneider.pdf
new file mode 100644
index 0000000000000000000000000000000000000000..f6355831e03ac23489f8f1a7419029d7398cdd3c
GIT binary patch
[cut]
_______________________________________________
pypy-commit mailing list
pypy-commit@python.org
http://mail.python.org/mailman/listinfo/pypy-commit