Author: Matti Picus <[email protected]>
Branch: stm-edit
Changeset: r5187:012112491eee
Date: 2014-04-06 22:54 +0300
http://bitbucket.org/pypy/extradoc/changeset/012112491eee/

Log: suggest edits to donatetm2

diff --git a/planning/tmdonate2.txt b/planning/tmdonate2.txt
--- a/planning/tmdonate2.txt
+++ b/planning/tmdonate2.txt
@@ -49,36 +49,36 @@
 they can use the existing ``threading`` module, with its associated GIL
 and the complexities of real multi-threaded programming (locks,
 deadlocks, races, etc.), which make this solution less attractive. The
-big alternative is for them to rely on one of various multi-process
-solutions that are outside the scope of the core language. All of them require a
-big restructuring of the program and often need extreme care and extra
+most attractive alternative for most developers is to rely on one of various multi-process
+solutions that are outside the scope of the core Python language. All of them require a
+major restructuring of the program and often need extreme care and extra
 knowledge to use them.
 
-The aim of this series of proposals is to research and implement
+We propose implemention of
 Transactional Memory in PyPy. This is a technique that recently came to
 the forefront of the multi-core scene. It promises to offer multi-core CPU
-usage without requiring to fall back to the multi-process solutions
-described above, and also should allow to change the core of the event systems
+usage without the explicit multiprocessing or event techniques above,
+and also should allow modifying the core of the event systems
 mentioned above to enable the use of multiple cores without the explicit
 use of the ``threading`` module by the user.
 
 The first proposal was launched near the start of 2012 and has covered
-the fundamental research part, up to the point of getting a first
+much of the fundamental research, up to the point of getting a first
 version of PyPy working in a very roughly reasonable state (after
 collecting about USD$27'000, which is little more than half of the money
-that was asked; hence the present second call for donations).
+that was sought; hence the present second call for donations).
 
-This second proposal aims at fixing the remaining issues until we get a
-really good GIL-free PyPy (described in `goal 1`_ below); and then we
-will focus on the various new features needed to actually use multiple
+We now propose fixing the remaining issues to obtaining a
+really good GIL-free PyPy (described in `goal 1`_ below). We
+will then focus on the various new features needed to actually use multiple
 cores without explicitly using multithreading (`goal 2`_ below), up to
-and including adapting some existing framework libraries like for
+and including adapting some existing framework libraries, for
 example Twisted, Tornado, Stackless, or gevent (`goal 3`_ below).
 
 
-In more details
-===============
+In more detail
+==============
 
 This is a call for financial help in implementing a version of PyPy able
 to use multiple processors in a single process, called PyPy-TM; and
@@ -87,14 +87,14 @@
 Armin Rigo and Remi Meier and possibly others.
 
 We currently estimate the final performance goal to be a slow-down of
-25% to 40%, i.e. running a fully serial application would take between
+25% to 40% from the current non-TM PyPy; i.e. running a fully serial application would take between
 1.25 and 1.40x the time it takes in a regular PyPy. (This goal has been
 reached already in some cases, but we need to make this result more
-broadly applicable.) We feel confident that it can work, in the
-following sense: the performance of PyPy-TM running any suitable
+broadly applicable.) We feel confident that the performance of PyPy-TM will
+running any suitable
 application should scale linearly or close-to-linearly with the number
 of processors. This means that starting with two cores, such
-applications should perform better than in a regular PyPy. (All numbers
+applications should perform better than a non-TM PyPy. (All numbers
 presented here are comparing different versions of PyPy which all have
 the JIT enabled.)
 
@@ -149,7 +149,7 @@
 with a much smaller Hardware Transactional Memory (HTM) library based on
 hardware features and running on Haswell-generation processors. This
 has been attempted by Remi Meier recently. However, it seems that we
-see scaling problems (as we expected them): the current generation of HTM
+see the scaling problems as expected: the current generation of HTM
 processors is limited to run small-scale transactions. Even the default
 transaction size used in PyPy-STM is often too much for HTM; and
 reducing this size increases overhead without completely solving the
@@ -162,15 +162,15 @@
 generally. A CPU with support for the virtual memory described in this
 paper would certainly be better for running PyPy-HTM.
 
-Another issue is sub-cache-line false conflicts (conflicts caused by two
+Another issue in HTM is sub-cache-line false conflicts (conflicts caused by two
 independent objects that happens to live in the same cache line, which
 is usually 64 bytes). This is in contrast with the current PyPy-STM,
 which doesn't have false conflicts of this kind at all and might thus be
-ultimately better for very-long-running transactions. None of the
-papers we know of discusses this issue.
+ultimately better for very-long-running transactions. We are not aware of
+published research discussing issues of very-long-running transactions.
 
 Note that right now PyPy-STM has false conflicts within the same object,
-e.g. within a list or a dictionary; but we can more easily do something
+e.g. within a list or a dictionary; but we can easily do something
 about it (see `goal 2_`). Also, it might be possible in PyPy-HTM to
 arrange objects in memory ahead of time so that such conflicts are very
 rare; but we will never get a rate of exactly 0%, which might be
@@ -179,20 +179,20 @@
 .. _`Virtualizing Transactional Memory`: http://pages.cs.wisc.edu/~isca2005/papers/08A-02.PDF
 
 
-Why do it with PyPy instead of CPython?
+Why do TM with PyPy instead of CPython?
 ---------------------------------------
 
 While there have been early experiments on Hardware Transactional Memory
 with CPython (`Riley and Zilles (2006)`__, `Tabba (2010)`__), there has
-been no recent one. The closest is an attempt using `Haswell on the
+been none in the past few years. The closest is an attempt using `Haswell on the
 Ruby interpreter`__. None of these attempts tries to do the same using
 Software Transactional Memory. We would nowadays consider it possible
 to adapt our stmgc-c7 library for CPython, but it would be a lot of
-work, starting from changing the reference-counting scheme. PyPy is
+work, starting from changing the reference-counting garbage colleciton scheme. PyPy is
 better designed to be open to this kind of research.
 
-But the best argument from an external point of view is probably that
-PyPy has got a JIT to start with. It is thus starting from a better
+However, the best argument from an objective point of view is probably that
+PyPy has already implemented a JIT. It is thus starting from a better
 position in terms of performance, particularly for the long-running kind
 of programs that we target here.
 
@@ -207,7 +207,7 @@
 PyPy-TM will be slower than judicious usage of existing alternatives,
 based on multiple processes that communicate with each other in one way
 or another. The counter-argument is that TM is not only a cleaner
-solution: there are cases in which it is not doable to organize (or
+solution: there are cases in which it is not possilbe to organize (or
 retrofit) an existing program into the particular format needed for the
 alternatives. In particular, small quickly-written programs don't need
 the additional baggage of cross-process communication; and large
@@ -217,35 +217,35 @@
 rest of the program should work without changes.
 
 
-Other platforms than the x86-64 Linux
+Platforms other than the x86-64 Linux
 -------------------------------------
 
-The first thing to note is that the current solution depends on having a
-huge address space available. If it were to be ported to any 32-bit
-architecture, the limitation to 2GB or 4GB of address space would become
-very restrictive: the way it works right now would further divide this
+The current solution depends on having a
+huge address space available. Porting to any 32-bit
+architecture would quickly run into the limitation of a 2GB or 4GB of address space.
+The way TM works right now would further divide this
 limit by N+1, where N is the number of segments. It might be possible
 to create partially different memory views for multiple threads that
 each access the same range of addresses; this would require extensions
 that are very OS-specific. We didn't investigate so far.
 
-The current version, which thus only works on 64-bit, still relies
+The current 64-bit version relies
 heavily on Linux- and clang-only features. We believe it is a suitable
 restriction: a lot of multi- and many-core servers commonly available
 are nowadays x86-64 machines running Linux. Nevertheless, non-Linux
 solutions appear to be possible as well. OS/X (and likely the various
 BSDs) seems to handle ``mmap()`` better than Linux does, and can remap
 individual pages of an existing mapping to various pages without hitting
-a limit of 65536 like Linux. Windows might also have a way, although we
-didn't measure yet; but the first issue with Windows would be to support
-Win64, which the regular PyPy doesn't.
+a limit of 65536 like Linux. Windows might also have a solution, although we
+didn't measure yet; but first we would need a 64-bit Windows PyPy, which has
+not seen much active support.
 
-We will likely explore the OS/X way (as well as the Windows way if Win64
-support grows in PyPy), but this is not included in the scope of this
-proposal.
+We will likely explore the OS/X path (as well as the Windows path if Win64
+support grows in PyPy), but this is not part of this current
+donation proposal.
 
 It might be possible to adapt the work done on x86-64 to the 64-bit
-ARMv8 as well, but we didn't investigate so far.
+ARMv8 as well. We have not investigated this so far.
 
 
 More readings
