[Python-Dev] Re: Comments on PEP 554 (Multiple Interpreters in the Stdlib)

2020-04-22 Thread Eric Snow
FYI, I'm not ignoring you. :)  Life intervened.  I'll respond in the
next day or two.

-eric

On Tue, Apr 21, 2020 at 10:42 AM Mark Shannon  wrote:
>
> Hi,
>
> I'm generally in favour of PEP 554, but I don't think it is ready to be
> accepted in its current form.
>
> My main objection is that without per-subinterpreter GILs (SILs?) PEP 554
> provides no value over threading or multi-processing.
> Multi-processing provides true parallelism and threads provide shared
> memory concurrency.
>
> If per-subinterpreter GILs are possible then, and only then,
> sub-interpreters will provide true parallelism and (limited) shared
> memory concurrency.
>
> The problem is that we don't know whether we can implement
> per-subinterpreter GILs without too large a negative performance impact.
> I think we can, but we can't say so for certain.
>
> So, IMO, we should not accept PEP 554 until we know for sure that
> per-subinterpreter GILs can be implemented efficiently.
>
>
>
> Detailed critique
> -----------------
>
> I don't see how `list_all()` can be both safe and accurate. The Java
> equivalent makes no guarantees of accuracy.
> Attempting to lock the list is likely to lead to deadlock and not
> locking it will lead to races; potentially dangerous ones.
> I think it would be best to drop this.
>
> `list_all_channels()`. See `list_all()` above.
>
> `.destroy()` is either misleading or unsafe.
> What does this do?
>
>  >>> is.destroy()
>  >>> is.run()
>
> If `run()` raises an exception then the interpreter must exist. Rename
> to `close()` perhaps?
>
> `Channel.interpreters` see `list_all()` and `list_all_channels()` above.
>
> How does `is_shareable()` work? Are you proposing some mechanism to
> transfer an object from one sub-interpreter to another? How would that
> work? If objects are not shared but serialized, why not use marshal or
> pickle instead of inventing a third serialization protocol?
>
> It would be clearer if channels only dealt with simple, contiguous
> binary data. As it stands the PEP doesn't state what form the received
> object will take.
> Once channels supporting the transfer of bytes objects work, then it is
> simple to pass more complex objects using pickle or marshal.
>
> Channels need a more detailed description of their lifespan. Ideally a
> state machine.
> For example:
> How does an interpreter detach from the receiving end of a channel that
> is never empty?
> What happens if an interpreter deletes the last reference to a non-empty
> channel? On the receiving end, or on the sending end?
>
> Isn't the lack of buffering in channels a recipe for deadlocks?
>
> What is the mechanism for reliably copying exceptions from one
> sub-interpreter to another in the `run()` method? If `run()` can raise
> an exception, why not let it return values?
>
>
> Cheers,
> Mark.
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/ZSE2G37E24YYLNMQKOQSBM46F7KLAOZF/
> Code of Conduct: http://python.org/psf/codeofconduct/
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/CEZXKXQTKWM7RX3CVOAFUZTHRYSERNCZ/


[Python-Dev] Re: Comments on PEP 554 (Multiple Interpreters in the Stdlib)

2020-04-22 Thread Ned Batchelder


On 4/21/20 12:32 PM, Mark Shannon wrote:
> Hi,
>
> I'm generally in favour of PEP 554, but I don't think it is ready to
> be accepted in its current form.

BTW, thanks for including the name of the PEP in the subject.  As a 
casual reader of this list, it's very helpful to have more than just the 
number, so I can decide whether to read into the deeper details.


--Ned.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/5J27XI3ILH3YARS6DZKXZDDS2DRF3BRY/


[Python-Dev] Re: Comments on PEP 554 (Multiple Interpreters in the Stdlib)

2020-04-22 Thread Rob Cliffe via Python-Dev



On 22/04/2020 19:40, Ned Batchelder wrote:
> On 4/21/20 12:32 PM, Mark Shannon wrote:
> > Hi,
> >
> > I'm generally in favour of PEP 554, but I don't think it is ready to
> > be accepted in its current form.
>
> BTW, thanks for including the name of the PEP in the subject.  As a
> casual reader of this list, it's very helpful to have more than just
> the number, so I can decide whether to read into the deeper details.

Hear, hear!  This is a point which is always worth bearing in mind.
Whenever I send an email to my boss (usually containing a software 
update) I try to include all relevant "keywords" in the subject so that 
he can easily search for it months later.

In other words I try to make my emails a valuable resource.
Rob Cliffe
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/Q3N63NCWL4CBGYQ3GYHZW5SRYGJBSMLS/


[Python-Dev] Re: Comments on PEP 554 (Multiple Interpreters in the Stdlib)

2020-04-22 Thread Kyle Stanley
Mark Shannon wrote:
> If `run()` can raise
> an exception, why not let it return values?

If there's not an implementation detail that makes this impractical,
I'd like to give my +1 on the `Interpreter.run()` method returning
values. From a usability perspective, it seems incredibly convenient
to have the ability to call a function in a subinterpreter, and then
directly get the return value instead of having to send the result
through a channel (for more simple use cases).

Also, not that the API for subinterpreters needs to be at all similar
to asyncio, but it would be consistent with `asyncio.run()` with
regards to being able to return values. Although one could certainly
argue that `asyncio.run()` and `Interpreter.run()` will have
significantly different use cases; with `asyncio.run()` being intended
as a primary entry point for a program, and `Interpreter.run()` being
used to execute arbitrary code in a single interpreter.
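
For reference, a minimal sketch of the `asyncio.run()` behavior mentioned above: it returns whatever the coroutine passed to it returns. The coroutine name `compute` is made up for illustration, and nothing here says anything about how `Interpreter.run()` would or should behave.

```python
import asyncio


async def compute() -> int:
    # Hypothetical coroutine; any awaitable work could go here.
    await asyncio.sleep(0)
    return 6 * 7


# asyncio.run() drives the coroutine to completion and hands back its result.
result = asyncio.run(compute())
print(result)
```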

Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/YSZBQEES7LCBANVIRIUXSKDHZGL3Q2F6/


[Python-Dev] Re: Comments on PEP 554 (Multiple Interpreters in the Stdlib)

2020-04-28 Thread Eric Snow
On Tue, Apr 21, 2020 at 10:42 AM Mark Shannon  wrote:
> I'm generally in favour of PEP 554, but I don't think it is ready to be
> accepted in its current form.

Yay(ish)! :)

> My main objection is that without per-subinterpreter GILs (SILs?) PEP 554
> provides no value over threading or multi-processing.
> Multi-processing provides true parallelism and threads provide shared
> memory concurrency.

I disagree. :)  I believe there are merits to the kind of programming
one can do via subinterpreter + channels (i.e. threads with opt-in
sharing).  I would also like to get broader community exposure to the
subinterpreter functionality sooner rather than later.  Getting the
Python API out there now will help folks get ready sooner for the
(later?) switch to per-interpreter GIL.  As Antoine put it, it allows
folks to start experimenting.  I think there is enough value in all
that to warrant landing PEP 554 in 3.9 even if per-interpreter GIL
only happens in 3.10.

> If per-subinterpreter GILs are possible then, and only then,
> sub-interpreters will provide true parallelism and (limited) shared
> memory concurrency.
>
> The problem is that we don't know whether we can implement
> per-subinterpreter GILs without too large a negative performance impact.
> I think we can, but we can't say so for certain.

I think we can as well, but I'd like to hear more about what obstacles
you think we might run into.

> So, IMO, we should not accept PEP 554 until we know for sure that
> per-subinterpreter GILs can be implemented efficiently.
>
>
>
> Detailed critique
> -----------------
>
> I don't see how `list_all()` can be both safe and accurate. The Java
> equivalent makes no guarantees of accuracy.
> Attempting to lock the list is likely to lead to deadlock and not
> locking it will lead to races; potentially dangerous ones.
> I think it would be best to drop this.
>
> `list_all_channels()`. See `list_all()` above.
>
> [out of order] `Channel.interpreters` see `list_all()` and 
> `list_all_channels()` above.

I'm not sure I understand your objection.  If a user calls the
function then they get a list.  If that list becomes outdated in the
next minute or the next millisecond, it does not impact the utility of
having that list.  For example, without that list how would one make
sure all other interpreters have been destroyed?

> `.destroy()` is either misleading or unsafe.
> What does this do?
>
>  >>> is.destroy()
>  >>> is.run()
>
> If `run()` raises an exception then the interpreter must exist. Rename
> to `close()` perhaps?

I see what you mean.  "Interpreter" objects are wrappers rather than
the actual interpreters, but that might not stop folks from thinking
otherwise.  I agree that "close" may communicate that nature better.
I guess so would "finalize", which is what the C-API calls it.  Then
again, you can't tell an object to "destroy" itself, can you?  It just
isn't clear what you are destroying (nor why we're so destructive).

So "close" aligns with other similarly purposed methods out there,
while "finalize" aligns with the existing C-API and also elevates the
complex nature of what happens.  If we change the name from "destroy"
then I'd lean toward "finalize".

FWIW, in your example above, the is.run() call would raise a
RuntimeError saying that it couldn't find an interpreter with "that"
ID.

> How does `is_shareable()` work? Are you proposing some mechanism to
> transfer an object from one sub-interpreter to another? How would that
> work?

The PEP purposefully does not prescribe how "is_shareable()" works.
That depends on the implementation for channels, for which there could
be several, and which will likely differ based on the Python
implementation.  Likewise the PEP does not dictate how channels work
(i.e. how objects are "shared").  That is an implementation detail.
We could talk about how we've implemented PEP 554, but that's not
highly relevant to the merits of this proposal (its API in
particular).

> If objects are not shared but serialized, why not use marshal or
> pickle instead of inventing a third serialization protocol?

Again, that's an implementation detail.  The PEP does not specify that
objects are actually shared or not.  In fact, I was careful to say:

   This does not necessarily mean that the actual objects will be
   shared.  Instead, it means that the objects' underlying data will
   be shared in a cross-interpreter way, whether via a proxy, a
   copy, or some other means.

> It would be clearer if channels only dealt with simple, contiguous
> binary data. As it stands the PEP doesn't state what form the received
> object will take.

You're right.  The PEP is not clear enough about what object an
interpreter will receive for a given sent object.  The intent is that
it will be the same type with the same data.  This might not always be
possible, so there may be cases where we allow for a compatible proxy.
Either way, I'll clarify this point in the PEP.
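
To make "the same type with the same data" concrete, here is a small sketch of the copy case. It uses a `pickle` round trip purely as a stand-in for some cross-interpreter mechanism; the PEP does not say that channels use pickle, and the variable names are invented for the example.

```python
import pickle

original = b"spam and eggs"

# pickle is only a placeholder for "some means of getting the data across";
# it is an assumption of this sketch, not something the PEP specifies.
received = pickle.loads(pickle.dumps(original))

same_type = type(received) is type(original)
same_data = received == original
print(same_type, same_data)
```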

> Once channels supporting the transf

[Python-Dev] Re: Comments on PEP 554 (Multiple Interpreters in the Stdlib)

2020-04-28 Thread Eric Snow
On Wed, Apr 22, 2020 at 7:40 PM Kyle Stanley  wrote:
> If there's not an implementation detail that makes this impractical,
> I'd like to give my +1 on the `Interpreter.run()` method returning
> values. From a usability perspective, it seems incredibly convenient
> to have the ability to call a function in a subinterpreter, and then
> directly get the return value instead of having to send the result
> through a channel (for more simple use cases).

The PEP only proposes the ability to run code (a string treated as a
script to run in the __main__ module) in an interpreter.  See
PyRun_StringFlags() in the C-API.  Passing a function to
Interpreter.run() is out of scope.  So returning anything doesn't make
much sense.
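
A rough analogy within a single interpreter, assuming the built-in `exec()` as a stand-in for running a script string in another interpreter's `__main__`: the string is executed as statements, so the call itself yields `None` and any results live in the namespace. The names `source` and `main_namespace` are made up for the sketch.

```python
source = """
total = sum(range(5))
"""

# A fresh namespace standing in for the target interpreter's __main__ module.
main_namespace = {"__name__": "__main__"}

# exec() runs the string as a sequence of statements and returns None,
# so there is no natural value to hand back to the caller.
returned = exec(source, main_namespace)

print(returned)
print(main_namespace["total"])
```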

> Also, not that the API for subinterpreters needs to be at all similar
> to asyncio, but it would be consistent with `asyncio.run()` with
> regards to being able to return values. Although one could certainly
> argue that `asyncio.run()` and `Interpreter.run()` will have
> significantly different use cases; with `asyncio.run()` being intended
> as a primary entry point for a program, and `Interpreter.run()` being
> used to execute arbitrary code in a single interpreter.

While somewhat different, this is something we should keep in mind.

-eric
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/EHU5DQJ53MNMQNUXHBL5HQ2B7CSWPRZI/


[Python-Dev] Re: Comments on PEP 554 (Multiple Interpreters in the Stdlib)

2020-04-29 Thread Mark Shannon

Hi,

On 29/04/2020 4:02 am, Eric Snow wrote:
> On Tue, Apr 21, 2020 at 10:42 AM Mark Shannon  wrote:
> > I'm generally in favour of PEP 554, but I don't think it is ready to be
> > accepted in its current form.
>
> Yay(ish)! :)
>
> > My main objection is that without per-subinterpreter GILs (SILs?) PEP 554
> > provides no value over threading or multi-processing.
> > Multi-processing provides true parallelism and threads provide shared
> > memory concurrency.
>
> I disagree. :)  I believe there are merits to the kind of programming
> one can do via subinterpreter + channels (i.e. threads with opt-in
> sharing).  I would also like to get broader community exposure to the
> subinterpreter functionality sooner rather than later.  Getting the
> Python API out there now will help folks get ready sooner for the
> (later?) switch to per-interpreter GIL.  As Antoine put it, it allows
> folks to start experimenting.  I think there is enough value in all
> that to warrant landing PEP 554 in 3.9 even if per-interpreter GIL
> only happens in 3.10.


You can already do CSP with multiprocessing, plus you get true parallelism.
The question the PEP needs to answer is "what do sub-interpreters offer 
that other forms of concurrency don't offer".


https://gist.github.com/markshannon/79cace3656b40e21b7021504daee950c

This table summarizes the core features of various approaches to
concurrency and compares them to "ideal" CSP. There are a lot of question
marks in the PEP 554 column. The PEP needs to address those.


As it stands, multiprocessing is a better fit for CSP than PEP 554.

IMO, sub-interpreters only become a useful option for concurrency if 
they allow true parallelism and are not much more expensive than threads.





> > If per-subinterpreter GILs are possible then, and only then,
> > sub-interpreters will provide true parallelism and (limited) shared
> > memory concurrency.
> >
> > The problem is that we don't know whether we can implement
> > per-subinterpreter GILs without too large a negative performance impact.
> > I think we can, but we can't say so for certain.
>
> I think we can as well, but I'd like to hear more about what obstacles
> you think we might run into.


As an example, accessing common objects like `None` and `int` will need 
extra indirection.

That *might* be an acceptable cost, or it might not. We don't know.

I can't tell you about the unknown unknowns :)




> > So, IMO, we should not accept PEP 554 until we know for sure that
> > per-subinterpreter GILs can be implemented efficiently.
> >
> > Detailed critique
> > -----------------
> >
> > I don't see how `list_all()` can be both safe and accurate. The Java
> > equivalent makes no guarantees of accuracy.
> > Attempting to lock the list is likely to lead to deadlock and not
> > locking it will lead to races; potentially dangerous ones.
> > I think it would be best to drop this.
> >
> > `list_all_channels()`. See `list_all()` above.
> >
> > [out of order] `Channel.interpreters` see `list_all()` and
> > `list_all_channels()` above.
>
> I'm not sure I understand your objection.  If a user calls the
> function then they get a list.  If that list becomes outdated in the
> next minute or the next millisecond, it does not impact the utility of
> having that list.  For example, without that list how would one make
> sure all other interpreters have been destroyed?


Do you not see the contradiction?
You say that it's OK if the list is outdated immediately, and then ask 
how one would make sure all other interpreters have been destroyed.


With true parallelism, the list could be out of date before it is even 
completed.





> > `.destroy()` is either misleading or unsafe.
> > What does this do?
> >
> >   >>> is.destroy()
> >   >>> is.run()
> >
> > If `run()` raises an exception then the interpreter must exist. Rename
> > to `close()` perhaps?
>
> I see what you mean.  "Interpreter" objects are wrappers rather than
> the actual interpreters, but that might not stop folks from thinking
> otherwise.  I agree that "close" may communicate that nature better.
> I guess so would "finalize", which is what the C-API calls it.  Then
> again, you can't tell an object to "destroy" itself, can you?  It just
> isn't clear what you are destroying (nor why we're so destructive).
>
> So "close" aligns with other similarly purposed methods out there,
> while "finalize" aligns with the existing C-API and also elevates the
> complex nature of what happens.  If we change the name from "destroy"
> then I'd lean toward "finalize".


I don't see why C-API naming conventions would take precedence over 
Python naming conventions for naming a Python method.




> FWIW, in your example above, the is.run() call would raise a
> RuntimeError saying that it couldn't find an interpreter with "that"
> ID.
>
> > How does `is_shareable()` work? Are you proposing some mechanism to
> > transfer an object from one sub-interpreter to another? How would that
> > work?
>
> The PEP purposefully does not prescribe how "is_shareable()" works.
> That depends on the implementation for channels, for which there could
> be several, and which will likely differ based on the Python
> implementation.  Likewise the

[Python-Dev] Re: Comments on PEP 554 (Multiple Interpreters in the Stdlib)

2020-04-29 Thread Eric Snow
Thanks, Mark.  Responses are in-line below.

-eric

On Wed, Apr 29, 2020 at 6:08 AM Mark Shannon  wrote:
> You can already do CSP with multiprocessing, plus you get true parallelism.
> The question the PEP needs to answer is "what do sub-interpreters offer
> that other forms of concurrency don't offer".
>
> https://gist.github.com/markshannon/79cace3656b40e21b7021504daee950c
>
> This table summarizes the core features of various approaches to
> concurrency and compares them to "ideal" CSP. There are a lot of question
> marks in the PEP 554 column. The PEP needs to address those.
>
> As it stands, multiprocessing is a better fit for CSP than PEP 554.
>
> IMO, sub-interpreters only become a useful option for concurrency if
> they allow true parallelism and are not much more expensive than threads.

While I have a different opinion here, especially if we consider
trajectory, I really want to keep discussion focused on the proposed
API in the PEP.  Honestly I'm considering taking up the recommendation
to add a new PEP about making subinterpreters official.  I never meant
for that to be more than a minor point for PEP 554.

> > I think we can as well, but I'd like to hear more about what obstacles
> > you think we might run into.
>
> As an example, accessing common objects like `None` and `int` will need
> extra indirection.
> That *might* be an acceptable cost, or it might not. We don't know.

ack

> I can't tell you about the unknown unknowns :)

:)

> > I'm not sure I understand your objection.  If a user calls the
> > function then they get a list.  If that list becomes outdated in the
> > next minute or the next millisecond, it does not impact the utility of
> > having that list.  For example, without that list how would one make
> > sure all other interpreters have been destroyed?
>
> Do you not see the contradiction?
> You say that it's OK if the list is outdated immediately, and then ask
> how one would make sure all other interpreters have been destroyed.
>
> With true parallelism, the list could be out of date before it is even
> completed.

I don't see why that would be a problem in practice.  Folks already
have to deal with that situation in many other venues in Python (e.g.
threading.enumerate()).  Not having the list at all would be more
painful.
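
A small sketch of the `threading.enumerate()` situation referred to above: the returned list is a snapshot, and it can be stale by the time the caller looks at it. The helper names (`worker`, `stop`, `stale`) are invented for the example.

```python
import threading

stop = threading.Event()


def worker():
    # Block until the main thread says to finish.
    stop.wait()


threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()

# Snapshot of the threads that are alive at this moment.
snapshot = threading.enumerate()

stop.set()
for t in threads:
    t.join()

# The snapshot still lists the workers, although none of them is alive now.
stale = [t for t in snapshot if t in threads and not t.is_alive()]
print(len(stale))
```

Callers of `threading.enumerate()` already treat the result as advisory; the argument in the thread is over whether the same holds up for a list of interpreters.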

> > So "close" aligns with other similarly purposed methods out there,
> > while "finalize" aligns with the existing C-API and also elevates the
> > complex nature of what happens.  If we change the name from "destroy"
> > then I'd lean toward "finalize".
>
> I don't see why C-API naming conventions would take precedence over
> Python naming conventions for naming a Python method.

Naming conventions aren't as important if we focus just on
communicating intent.  Maybe it's just me, but "close" does not
reflect the complexity that "finalize" does.

Regardless, if it is called "close" then folks can use
contextlib.closing() with it.  That's enough to sell me on it.
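
A sketch of what that would look like, assuming a hypothetical `Interpreter` stand-in whose only relevant feature is a `close()` method. The class body is illustrative (it uses `exec()` in the current interpreter) and is not the PEP 554 API; `contextlib.closing()` itself is the real stdlib helper.

```python
from contextlib import closing


class Interpreter:
    """Hypothetical stand-in with a close() method; not the PEP 554 class."""

    def __init__(self):
        self.closed = False

    def run(self, source):
        if self.closed:
            raise RuntimeError("interpreter is closed")
        # Stand-in for executing the source in a separate interpreter.
        exec(source, {"__name__": "__main__"})

    def close(self):
        self.closed = True


# closing() calls interp.close() when the block exits, even on error.
with closing(Interpreter()) as interp:
    interp.run("print('hello from the with block')")

print(interp.closed)
```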

> Ok, let me rephrase. What does "is_shareable()" do?
> Is `None` sharable? What about `int`?

It's up to the Python implementation to decide if something is
shareable or not.  In the case of CPython, PEP 554 says: "Initially
this will include None, bytes, str, int, and channels. Further types
may be supported later."

> It's not an implementation detail. The user needs to know the *exact* set
> of objects that can be communicated. Using marshal or pickle provides
> that information.

The point of is_shareable() is to expose that information, though not
as a list.  Why would users want that full list?  It could be huge,
BTW.  If you are talking about documentation then yeah, we would
definitely document which types CPython considers shareable.
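
As a rough illustration of a predicate rather than a list, here is a sketch based only on the types quoted above (None, bytes, str, int), leaving out channels because there is no stand-in for them here. The exact-type comparison is an assumption of this sketch; the real check is implementation-defined.

```python
# Types quoted from the PEP as initially shareable in CPython, minus channels.
_SHAREABLE_TYPES = (type(None), bytes, str, int)


def is_shareable(obj) -> bool:
    """Illustrative check only; the real function is implementation-defined."""
    # Assumption: match the exact type, so subclasses are not treated as shareable.
    return type(obj) in _SHAREABLE_TYPES


for candidate in (None, b"raw", "text", 42, [1, 2], 3.14):
    print(repr(candidate), is_shareable(candidate))
```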
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/IEMXNKSOZT23OEXFWF3VNJMYSRV7OCUU/