Author: Maciej Fijalkowski <[email protected]>
Branch: extradoc
Changeset: r5547:a979df8a839e
Date: 2015-09-08 17:37 +0200
http://bitbucket.org/pypy/extradoc/changeset/a979df8a839e/

Log:    draft a blog post

diff --git a/blog/draft/warmup-improvements.rst b/blog/draft/warmup-improvements.rst
new file mode 100644
--- /dev/null
+++ b/blog/draft/warmup-improvements.rst
@@ -0,0 +1,51 @@
+Hello everyone!
+
+I'm very pleased to announce that we've just managed to merge the optresult
+branch. Under this cryptic name hides the biggest JIT refactoring we've done
+in a couple of years, one mostly focused on the warmup time and memory impact
+of PyPy.
+
+To understand why we did it, let's look back in time: when we got the first
+working JIT prototype in 2009, we focused exclusively on peak performance,
+with some consideration towards memory usage but without serious
+consideration of warmup time. This means we accumulated quite a bit of
+technical debt over time, which we're now trying, with some difficulty,
+to address.
+
+The branch does "one" thing - it changes the underlying model of how
+operations are represented during tracing and optimization. Let's consider
+a simple loop like this::
+
+    [i0, i1]
+    i2 = int_add(i0, i1)
+    i3 = int_add(i2, 1)
+    i4 = int_is_true(i3)
+    guard_true(i4)
+    jump(i3, i2)
+
+The original representation would allocate a ``Box`` for each of ``i0`` - ``i4``
+and then store those boxes in instances of ``ResOperation``. The list of such
+operations would then go to the optimizer. Those lists are big - we usually
+remove ``90%`` of them during optimizations, but they can be a couple of
+thousand elements long. Overall, allocating those big lists takes a toll on
+warmup time, especially due to the GC pressure. The branch removes ``Box``
+completely; instead, a value is a link to the ``ResOperation`` that produces
+it. So in the above example, ``i2`` would refer to its producer,
+``i2 = int_add(i0, i1)``, with arguments getting special treatment.
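To make the change concrete, here is a rough Python sketch of the two models. The class names and shapes are simplified illustrations, not PyPy's actual classes:

```python
# Illustrative sketch (simplified, not PyPy's real classes) of the
# representation change: result boxes vs. operations-as-values.

class Box:
    """Old model: one extra heap object allocated per SSA value."""

class ResOperation:
    """An operation; in the new model it also stands for its own result."""
    def __init__(self, opname, *args):
        self.opname = opname
        self.args = args

# Old model: a Box per value, plus ResOperations that mention those boxes.
b0, b1, b2 = Box(), Box(), Box()           # boxes for i0, i1, i2
old_add = ResOperation("int_add", b0, b1)  # result stored into b2 elsewhere

# New model: no result boxes; a later operation links directly to its
# producing operation, so "i2" simply *is* the int_add ResOperation.
i0 = ResOperation("input", 0)
i1 = ResOperation("input", 1)
i2 = ResOperation("int_add", i0, i1)
i3 = ResOperation("int_add", i2, 1)        # argument is the producer itself

assert i3.args[0] is i2                    # i3 refers to its producer i2
```

Dropping one heap object per value is where the reduced allocation pressure in the new model comes from.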
+
+That alone reduces the GC pressure slightly, but we went the extra mile
+and changed a bunch of data structures in the optimizer itself. Overall
+we measured about a 50% speed improvement in the optimizer, which reduces
+the overall warmup time by between 10% and 30%. The very
+`obvious warmup benchmark`_ got a speedup from 4.5s to 3.5s, an almost
+30% improvement. Obviously the speedup on a given benchmark vastly
+depends on how much warmup time there is in that benchmark. We observed
+the annotation step of translating PyPy getting about 30% faster and the
+overall translation time dropping by about 7%, so your mileage may vary.
+In most cases there should be no visible difference if you're already
+running at peak performance, but wherever warmup is a problem there should
+be a modest speedup.
+
+.. _`obvious warmup benchmark`: https://bitbucket.org/pypy/benchmarks/src/fe2e89c0ae6846e3a8d4142106a4857e95f17da7/warmup/function_call2.py?at=default
+
+Cheers!
+fijal & arigo
+
diff --git a/talk/ep2015/performance/Makefile b/talk/ep2015/performance/Makefile
--- a/talk/ep2015/performance/Makefile
+++ b/talk/ep2015/performance/Makefile
@@ -5,7 +5,7 @@
 # https://sourceforge.net/tracker/?func=detail&atid=422032&aid=1459707&group_id=38414
 
 talk.pdf: talk.rst author.latex stylesheet.latex
-       python ../../bin/rst2beamer.py --stylesheet=stylesheet.latex --documentoptions=14pt talk.rst talk.latex || exit
+       python ../../bin/rst2beamer.py --stylesheet=stylesheet.latex --documentoptions=14pt --input-encoding=utf8 --output-encoding=utf8 talk.rst talk.latex || exit
        #/home/antocuni/.virtualenvs/rst2beamer/bin/python `which rst2beamer.py` --stylesheet=stylesheet.latex --documentoptions=14pt talk.rst talk.latex || exit
        sed 's/\\date{}/\\input{author.latex}/' -i talk.latex || exit
        #sed 's/\\maketitle/\\input{title.latex}/' -i talk.latex || exit
diff --git a/talk/ep2015/performance/author.latex b/talk/ep2015/performance/author.latex
--- a/talk/ep2015/performance/author.latex
+++ b/talk/ep2015/performance/author.latex
@@ -2,7 +2,7 @@
 
 \title[Python and PyPy performance]{Python and PyPy performance\\(not) for dummies}
 \author[antocuni,fijal]
-{Antonio Cuni and Maciej Fijalkowski}
+{Antonio Cuni and Maciej Fijałkowski}
 
 \institute{EuroPython 2015}
 \date{July 21, 2015}
diff --git a/talk/ep2015/performance/talk.pdf b/talk/ep2015/performance/talk.pdf
index 874cebbd96fb24e3f93148dfd82afb0985fc7145..b17e14c7d49dc0a2c47900b50419cb90e1a31aae
GIT binary patch

[cut]

diff --git a/talk/ep2015/performance/talk.rst b/talk/ep2015/performance/talk.rst
--- a/talk/ep2015/performance/talk.rst
+++ b/talk/ep2015/performance/talk.rst
@@ -16,16 +16,6 @@
 - http://baroquesoftware.com/
 
 
-About you
--------------
-
-- You are proficient in Python
-
-- Your Python program is slow
-
-- You want to make it fast(er)
-
-
 Optimization for dummies
 -------------------------
 
@@ -55,43 +45,37 @@
 
   2. How to address the problems
 
+Part 1
+------
 
-Part 1
--------
-
-* profiling
-
-* tools
-
+* identifying the slow spots
 
 What is performance?
 --------------------
 
-* you need something quantifiable by numbers
+* something quantifiable by numbers
 
 * usually, time spent doing task X
 
-* sometimes number of requests, latency, etc.
+* number of requests, latency, etc.
 
-* some statistical properties about that metric (average, minimum, maximum)
+* statistical properties about that metric
 
 Do you have a performance problem?
 ----------------------------------
 
-* define what you're trying to measure
+* what you're trying to measure
 
-* measure it (production, benchmarks, etc.)
+* means to measure it (production, benchmarks, etc.)
 
-* see if Python is the cause here (if it's not, we can't help you,
-  but I'm sure someone can)
+* is Python the cause here?
 
-* make sure you can change and test stuff quickly (e.g. benchmarks are better
-  than changing stuff in production)
+* environment to quickly measure and check the results
 
-* same as for debugging
+  - same as for debugging
 
-We have a python problem
-------------------------
+When Python is the problem
+--------------------------
 
 * tools, timers etc.
 
@@ -106,7 +90,7 @@
 
 * cProfile, runSnakeRun (high overhead) - event based profiler
 
-* plop, vmprof - statistical profiler
+* plop, vmprof - statistical profilers
 
 * cProfile & vmprof work on pypy
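 As a concrete example of the event-based approach mentioned above, the stdlib ``cProfile`` module can be driven from code like this. This is a minimal sketch with a made-up ``busy`` workload; it runs unchanged on CPython and PyPy:

```python
# Minimal cProfile sketch: event-based profiling of a toy function.
import cProfile
import pstats
import io

def busy():
    # Hypothetical workload standing in for the real program.
    return sum(i * i for i in range(100000))

profiler = cProfile.Profile()
profiler.enable()
busy()
profiler.disable()

# Print the five hottest entries, sorted by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

 Because cProfile instruments every call event, its overhead is high; that is exactly the trade-off the statistical profilers above avoid.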
 
@@ -121,8 +105,8 @@
 
 * CPython, PyPy, possibly more virtual machines
 
-why not just use gperftools?
-----------------------------
+why not gperftools?
+--------------------
 
 * C stack does not contain python-level frames
 
@@ -349,10 +333,19 @@
 
 * avoid creating classes at runtime
 
+Example
+-------
+
+* ``map(operator.attrgetter('x'), list)``
+
+vs
+
+* ``[x.x for x in list]``
+
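+A rough way to measure the two spellings from the example above (the ``Point`` class and list size are made up for illustration; absolute numbers differ a lot between CPython and PyPy):

```python
# Micro-benchmark sketch comparing attrgetter+map vs. a list comprehension.
import operator
import timeit

class Point:
    def __init__(self, x):
        self.x = x

points = [Point(i) for i in range(1000)]

def via_attrgetter():
    return list(map(operator.attrgetter('x'), points))

def via_listcomp():
    return [p.x for p in points]

# Both spellings produce the same result; only the speed differs.
assert via_attrgetter() == via_listcomp()

t1 = timeit.timeit(via_attrgetter, number=200)
t2 = timeit.timeit(via_listcomp, number=200)
print("attrgetter: %.4fs  listcomp: %.4fs" % (t1, t2))
```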
 More about PyPy
 ---------------
 
-* we are going to run a PyPy open space
+* we are going to run a PyPy open space (tomorrow 18:00 @ A4)
 
 * come ask more questions
 
_______________________________________________
pypy-commit mailing list
[email protected]
https://mail.python.org/mailman/listinfo/pypy-commit