On Mon, Jul 16, 2018 at 1:21 PM, Nathaniel Smith <n...@pobox.com> wrote:
> On Sun, Jul 15, 2018 at 6:00 PM, Chris Angelico <ros...@gmail.com> wrote:
>> On Mon, Jul 16, 2018 at 10:31 AM, Nathaniel Smith <n...@pobox.com> wrote:
>>> On Sun, Jul 8, 2018 at 11:27 AM, David Foster <davidf...@gmail.com> wrote:
>>>> * The Actor model can be used with some effort via the “multiprocessing”
>>>> module, but it doesn’t seem that streamlined and forces there to be a
>>>> separate OS process per line of execution, which is relatively expensive.
>>>
>>> What do you mean by "the Actor model"? Just shared-nothing
>>> concurrency? (My understanding is that in academia it means
>>> shared-nothing + every thread/process/whatever gets an associated
>>> queue + queues are globally addressable + queues have unbounded
>>> buffering + every thread/process/whatever is implemented as a loop
>>> that reads messages from its queue and responds to them, with no
>>> internal concurrency. I don't know why this particular bundle of
>>> features is considered special. Lots of people seem to use it in a
>>> looser sense, though.)
>>
>> Shared-nothing concurrency is, of course, the very easiest way to
>> parallelize. But let's suppose you're trying to create an online
>> multiplayer game. Since it's a popular genre at the moment, I'll go
>> for a battle royale game (think PUBG, H1Z1, Fortnite, etc). A hundred
>> people enter; one leaves. The game has to let those hundred people
>> interact, which means that all hundred people have to be connected to
>> the same server. And you have to process everyone's movements,
>> gunshots, projectiles, etc, etc, etc, fast enough to be able to run a
>> server "tick" enough times per second - I would say 32 ticks per
>> second is an absolute minimum, 64 is definitely better. So what
>> happens when the processing required takes more than one CPU core for
>> 1/32 seconds? A shared-nothing model is either fundamentally
>> impossible, or a meaningless abstraction (if you interpret it to mean
>> "explicit queues/pipes for everything"). What would the "Actor" model
>> do here?
>
> "Shared-nothing" is a bit of jargon that means there's no *implicit*
> sharing; your threads can still communicate, the communication just
> has to be explicit. I don't know exactly what algorithms your
> hypothetical game needs, but they might be totally fine in a
> shared-nothing approach. It's not just for embarrassingly parallel
> problems.

Right, so basically it's the exact model that Python *already* has for
multiprocessing - once you go to separate processes, nothing is
implicitly shared, and everything has to be done with queues.
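To make that concrete, here's a rough sketch of the actor-ish,
shared-nothing shape multiprocessing already gives you - one process
per "actor", an explicit queue each, and a read-a-message/act/reply
loop. The worker function and the message format are just
placeholders, not anything from the thread:

# Minimal sketch: an actor-style worker using multiprocessing.
# Nothing is shared implicitly; all communication goes through
# explicit queues.
import multiprocessing as mp

def worker(inbox: mp.Queue, outbox: mp.Queue) -> None:
    # The "actor" loop: read a message, act on it, reply.  No internal
    # concurrency, no shared state with the parent process.
    while True:
        msg = inbox.get()
        if msg is None:          # sentinel: shut down
            break
        outbox.put(msg * 2)      # stand-in for real work

if __name__ == "__main__":
    inbox, outbox = mp.Queue(), mp.Queue()
    proc = mp.Process(target=worker, args=(inbox, outbox))
    proc.start()
    for i in range(5):
        inbox.put(i)
    print([outbox.get() for _ in range(5)])  # [0, 2, 4, 6, 8]
    inbox.put(None)
    proc.join()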

>> Ideally, I would like to be able to write my code as a set of
>> functions, then easily spin them off as separate threads, and have
>> them able to magically run across separate CPUs. Unicorns not being a
>> thing, I'm okay with warping my code a bit around the need for
>> parallelism, but I'm not sure how best to do that. Assume here that we
>> can't cheat by getting most of the processing work done with the GIL
>> released (eg in Numpy), and it actually does require Python-level
>> parallelism of CPU-heavy work.
>
> If you need shared-memory threads, on multiple cores, for CPU-bound
> logic, where the logic is implemented in Python, then yeah, you
> basically need a free-threaded implementation of Python. Jython is
> such an implementation. PyPy could be if anyone were interested in
> funding it [1], but apparently no-one is. Probably removing the GIL
> from CPython is impossible. (I'd be happy to be proven wrong.) Sorry I
> don't have anything better to report.

(This was a purely hypothetical example.)

There could be some interesting results from using the GIL only for
truly global objects, and then having other objects guarded by arena
locks. The trouble is that, in CPython, as soon as you reference any
read-only object from the globals, you need to raise its refcount -
and that refcount bump is a write, so even purely read-only access
still needs the lock.
ISTR someone mentioned something along the lines of
sys.eternalize(obj) to flag something as "never GC this thing, it no
longer has a refcount", which would then allow global objects to be
referenced in a truly read-only way (eg to call a function). Sadly,
I'm not expert enough to actually look into implementing it, but it
does seem like a very cool concept. It also fits into the "warping my
code a bit" category (eg eternalizing a small handful of key objects,
and paying the price of "well, now they can never be garbage
collected"), with the potential to then parallelize more easily.

> The good news is that there are many, many situations where you don't
> actually need "shared-memory threads, on multiple cores, for CPU-bound
> logic, where the logic is implemented in Python".

Oh absolutely. MOST of my parallelism requirements involve regular
Python threads, because they spend most of their time blocked on
something. That one is easy. The hassle comes when something MIGHT
need parallelism and might not, based on (say) how much data it has to
work with; for those kinds of programs, I would like to be able to
code it the simple way with minimal code overhead, but still be able
to split it over cores. And yes, I'm aware that it's never going to
be perfect, but the closer the better.
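Concretely, the closest I can get to that today with just the stdlib
is something like the sketch below - write the function once and pick
the executor based on how heavy the job looks. crunch() and the size
threshold are invented purely for illustration:

# Rough sketch of the "code it once, split over cores if needed" shape.
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def crunch(chunk: list[int]) -> int:
    return sum(x * x for x in chunk)       # stand-in for CPU-heavy work

def process_all(chunks: list[list[int]]) -> list[int]:
    # Small jobs: plain threads are cheap and fine (they'd mostly be
    # blocked anyway).  Big jobs: pay the process-pool overhead to get
    # real multi-core parallelism despite the GIL.
    heavy = sum(map(len, chunks)) > 1_000_000
    executor_cls = ProcessPoolExecutor if heavy else ThreadPoolExecutor
    with executor_cls() as pool:
        return list(pool.map(crunch, chunks))

if __name__ == "__main__":
    print(process_all([[1, 2, 3], [4, 5, 6]]))  # [14, 77]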

ChrisA
