Compared with preforking (as in the case of a process pool), threads use less
memory and reduce the tracing/optimization time on the code, since the same
PyPy instance has already traced and optimized that part of the code.
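
To make the comparison concrete, here is a rough sketch (not code from
PyParallel or PyPy; the names work, run_in_threads and run_in_forked_child
are made up for illustration, and it assumes a POSIX system where os.fork is
available). A thread pool keeps every job in one interpreter process, while a
forked worker is a separate process that starts from a copy-on-write copy of
the parent's memory and sends its result back over a plain pipe, with no
pickling. How much JIT state is usefully shared in either case is exactly the
open question here, so the sketch only shows the process layout.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def work(n):
    # Hypothetical hot loop; on PyPy this is the kind of code the JIT traces.
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_in_threads(jobs, workers=4):
    # Every thread runs inside the current interpreter process, so each job
    # reports the same PID as the caller.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda n: (os.getpid(), work(n)), jobs))


def run_in_forked_child(n):
    # Fork a child that inherits the parent's memory (copy-on-write) and
    # returns its PID and result through a pipe as plain text.
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: compute, write the answer, and exit immediately.
        os.close(read_fd)
        try:
            os.write(write_fd, f"{os.getpid()} {work(n)}".encode())
        finally:
            os._exit(0)
    # Parent process: read until the child closes its end of the pipe.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        data = reader.read()
    os.waitpid(pid, 0)
    child_pid, result = data.decode().split()
    return int(child_pid), int(result)
```

Calling run_in_threads([10, 100]) returns pairs that all carry the caller's
PID, while run_in_forked_child(10) returns a different PID with the same
numeric result.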

On Mon, Jun 20, 2016 at 20:16, Maciej Fijalkowski <fij...@gmail.com> wrote:

> no, you misunderstood me:
>
> if you want to use multiple processes, you're not gonna start a new one
> per thing to do. You'll have a process pool and use that. Also, if you
> don't use multiprocessing, you don't use pickling, you use something
> sane for communication. PyParallel essentially allows read-only
> access to the global state, but read-only is ill-defined and poorly
> enforced (especially in the case of cpy extensions) in Python. So what
> do you get as opposed to multiple processes?
>
> On Mon, Jun 20, 2016 at 6:42 PM, Omer Katz <omer.d...@gmail.com> wrote:
> > Let's review what forking does in Python from a 10,000ft view:
> > 1) It pickles the current state of the process.
> > 2) Starts a new Python process
> > 3) Unpickles the current state of the process
> > There are a lot more memory allocations when forking compared to starting
> > a new thread. That makes forking unsuitable for small workloads.
> > I'm guessing that PyPy does not save the trace/optimized ASM of the forked
> > process in the parent process, so each time you start a new process you
> > have to trace again, which makes small workloads even less suitable, and
> > even large processing batches will need to be traced again.
> >
> > In the case of pre-forking servers, each PyPy instance has to trace and
> > optimize the same code for no reason. Threads would allow us to reduce
> > warmup time for this case. They will also consume less memory.
> >
> > On Mon, Jun 20, 2016 at 17:47, Maciej Fijalkowski <fij...@gmail.com> wrote:
> >>
> >> so quick question - what's the win compared to multiple processes?
> >>
> >> On Mon, Jun 20, 2016 at 8:51 AM, Omer Katz <omer.d...@gmail.com> wrote:
> >> > Hi all,
> >> > There was an experiment based on CPython's code called PyParallel that
> >> > allows running threads in parallel without STM and modifying source code
> >> > of both Python and C extensions. The only limitation is that they
> >> > disallow mutation of global state in parallel context.
> >> > I briefly mentioned it before on PyPy's freenode channel.
> >> > I'd like to discuss why the approach is useful, how it can benefit PyPy
> >> > users and how it can be implemented.
> >> > Running in parallel without mutating global state can help servers
> >> > use each thread to handle a request. It can also allow logging in
> >> > parallel or sending an HTTP request (or an AMQP message) without sharing
> >> > the response with the main thread. This is useful in some cases, and
> >> > since PyParallel managed to keep the same semantics it shouldn't break
> >> > CPyExt.
> >> > If we keep to the following rules:
> >> >
> >> > No global state mutation is allowed
> >> > No new keywords or code modifications required
> >> > No CPyExt code is allowed (for now)
> >> >
> >> > I believe that users can somewhat benefit from this implementation if
> >> > done correctly.
> >> > As for implementation, if we can trace the code running in the thread
> >> > and ensure it's not mutating global state and that CPyExt is never used
> >> > during the thread's course, we can simply release the GIL when such a
> >> > thread is run.
> >> > That requires less knowledge than using STM and fewer code modifications.
> >> > However, I think that attempting to do so will introduce the same issue
> >> > with caching traces (Armin, am I correct here?).
> >> >
> >> > As for CPyExt, we could copy the same code modifications that
> >> > PyParallel did, but I suspect that it will be so slow that the benefit
> >> > of running in parallel will be completely lost for all cases but very
> >> > long threads.
> >> >
> >> > Is what I'm suggesting even possible? How challenging will it be?
> >> >
> >> > Thanks,
> >> > Omer Katz.
> >> >
> >> > _______________________________________________
> >> > pypy-dev mailing list
> >> > pypy-dev@python.org
> >> > https://mail.python.org/mailman/listinfo/pypy-dev
> >> >
>
