Robert Hancock hosted a BoF at PyCon about concurrency and multiprocessing. I went there to find out how other people were doing things, and especially how other languages handle them. It would be nice to kill the GIL, if only we knew of a brilliant way to do this.
Unfortunately, I was one year too late for this discussion. This is what Robert Hancock, David Beazley, Peter Portante and others discussed at PyCon _last year_. So I asked Robert Hancock for the notes he took then. (I continue after this forwarded message)

------- Forwarded Message

Return-Path: hancock.rob...@gmail.com
Delivery-Date: Mon Mar 14 17:09:51 2011
Subject: Re: please send me the notes you took last year
To: Laura Creighton <l...@openend.se>

These are the books that I mentioned:

Machine Learning: An Algorithmic Perspective
http://www.amazon.com/gp/product/1420067184
I found this more approachable than the Bishop, and a number of examples are in Python.

Introduction to Data Mining
http://www.amazon.com/gp/product/0321321367
I've only started this, but it pairs nicely with David Mease's Google Tech Talk series.
http://www.youtube.com/watch?v=zRsMEl6PHhM

1. Make all IO non-blocking and mediate the processes like greenlets. This does not allow you to take advantage of the OS-level thread scheduler, which is far more sophisticated than greenlets. See the Linux kernel specifications for the details of the multi-level feedback queue.

2. Construct a multi-level feedback queue within Python. This is extraordinarily complicated to implement. Why duplicate what already exists?

3. Do we need to maintain compatibility with being able to call out to C functions? The primary complaint about the GIL is that it does not efficiently handle CPU-bound processes and multi-cores. Running sequential processes in threads on multi-cores can actually slow down the processes.

4. Who has already solved this problem as part of the language?

   - Erlang (No one knew the nitty-gritty details.)
   - Go - based on Tony Hoare's CSP and the work done on Plan 9 at Bell Labs. Uses the system scheduler and creates its own mini-threads (4k). Need to investigate the source code online. Goroutines do not have OS thread affinity; they can multiplex over multiple threads.
   - Java - Early on Java used several versions of green threads, but now uses system threads. The JVM punts to the OS.

Conclusions
--------------

1. Do not reinvent the wheel! Many people have worked decades on this problem. Leverage their expertise.

2. Coroutines are frequently better than threads, but do [not] scale, and each coroutine must be restarted in the thread where it was spawned. See greenlet.c. Greenlets are also chained and have mutual dependencies. The order of execution is arbitrary, with no method for priorities.

3. Investigate if there is an alternative to the current method of calling external C objects.

4. Dave did a POC on priorities:
   http://dabeaz.blogspot.com/2010/02/revisiting-thread-priorities-and-new.html

5. Everyone agreed that some type of priority mechanism is a good idea, but wanted to see what Unladen Swallow does. (As of March 2011 Google is no longer actively developing this project.)

References
-----------

Dave Beazley - GIL Wars
Dave Beazley - Yieldable Threads
http://www.dabeaz.com/blog.html
Linux Kernel http://goo.gl/RkxVs
Erlang
Go - golang.org
CSP - Tony Hoare http://www.usingcsp.com/cspbook.pdf

I spoke with Peter Portante yesterday, and he would be very interested in participating even though he has very little free time. Peter works at HP and worked on their OS threading model. Also, see his PyCon 2010 talk on non-blocking IO and the 2011 talk on coroutines.

Let me know if you have any questions.

Bob Hancock

Blog - www.bobhancock.org
Twitter - bob_hancock and nycgtug

------- End of Forwarded Message

And, indeed, Peter Portante is very interested in thinking about doing without the GIL.
He's already sent me this:

Date: Sun, 13 Mar 2011 16:42:00 -0400
To: Laura Creighton <l...@openend.se>
From: Peter Portante <peter.a.porta...@gmail.com>
Subject: Re: [pypy-dev] possibly of use for our documentation
Return-Path: peter.a.porta...@gmail.com
Delivery-Date: Sun Mar 13 21:42:21 2011

Hi Laura,

Just left PyCon and heard talk of PyPy removing the GIL. I worked on Tru64 UNIX's thread library for 8 years. If there is anything I can do to help with this effort, please let me know.

Thanks,

-peter

-------------------------

Note: I have never promised anybody anything. This was a 'please educate me' appeal. But Bob Hancock is coming back this afternoon to talk with us. Anybody got any questions they want to make sure I ask him?

Laura
_______________________________________________
pypy-dev@codespeak.net
http://codespeak.net/mailman/listinfo/pypy-dev
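
For anyone who wants to check the claim in point 3 of the forwarded notes (CPU-bound work run in threads does not get faster under the GIL, and can get slower), here is a minimal sketch. It is not taken from the notes or from anyone's talk; the function names (count_down, run_sequential, run_threaded) and the loop size are assumptions made for illustration. It runs the same busy loop twice in sequence and then in two threads, and reports the wall-clock time of each.

```python
import threading
import time


def count_down(n):
    """CPU-bound busy loop; returns the number of iterations performed."""
    steps = 0
    while n > 0:
        n -= 1
        steps += 1
    return steps


def run_sequential(n, jobs=2):
    """Run the busy loop `jobs` times, one after the other."""
    start = time.perf_counter()
    results = [count_down(n) for _ in range(jobs)]
    return results, time.perf_counter() - start


def run_threaded(n, jobs=2):
    """Run the busy loop in `jobs` OS threads at the same time."""
    results = [None] * jobs

    def worker(i):
        results[i] = count_down(n)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(jobs)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


if __name__ == "__main__":
    n = 5_000_000  # arbitrary size, chosen only so the loop takes measurable time
    _, seq = run_sequential(n)
    _, thr = run_threaded(n)
    print(f"sequential: {seq:.2f}s  threaded: {thr:.2f}s")
```

On an interpreter with a GIL, the threaded time is typically no better than the sequential one, because only one thread executes Python bytecode at a time. The exact numbers depend on the interpreter, its version and the machine, so none are quoted here.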