Re: [Python-Dev] PEP 553; the interaction between $PYTHONBREAKPOINT and -E
I concur with Antoine, please don't add a special case for -E. But it seems like you already agreed with that :-)

Victor

On 5 Oct 2017 at 05:33, Barry Warsaw wrote:
> On Oct 4, 2017, at 21:52, Nick Coghlan wrote:
>
>>> Unfortunately we probably won't really get a good answer in practice until Python 3.7 is released, so maybe I just choose one and document that the behavior of PYTHONBREAKPOINT under -E is provisional for now. If that's acceptable, then I would just treat -E for PYTHONBREAKPOINT the same as all other environment variables, and we'll see how it goes.
>>
>> I'd be fine with this as the main reason I wanted PYTHONBREAKPOINT=0 was for pre-merge CI systems, and those tend to have tightly controlled environment settings, so you don't need to rely on -E or -I when running your tests.
>>
>> That said, it may also be worth considering a "-X nobreakpoints" option (and then -I could imply "-E -s -X nobreakpoints").
>
> Thanks for the feedback Nick. For now we'll go with the standard behavior of -E and see how it goes. We can always add a -X later.
>
> Cheers,
> -Barry
Re: [Python-Dev] PEP 553; the interaction between $PYTHONBREAKPOINT and -E
On 04.10.17 21:06, Barry Warsaw wrote:
> Victor brings up a good question in his review of the PEP 553 implementation.
>
> https://github.com/python/cpython/pull/3355
> https://bugs.python.org/issue31353
>
> The question is whether $PYTHONBREAKPOINT should be ignored if -E is given. I think it makes sense for $PYTHONBREAKPOINT to be sensitive to -E, but in thinking about it some more, it might make better sense for the semantics to be that when -E is given, we treat it like PYTHONBREAKPOINT=0, i.e. disable the breakpoint, rather than fall back to the `pdb.set_trace` default.
>
> My thinking is this: -E is often used in production environments to prevent stray environment settings from affecting the Python process. In those environments, you probably also want to prevent stray breakpoints from stopping the process, so it's more helpful to disable breakpoint processing when -E is given rather than running pdb.set_trace(). If you have a strong opinion either way, please follow up here, on the PR, or on the bug tracker.

What if we make the default value depend on the debug level? In debug mode it is "pdb.set_trace", in optimized mode it is "0". Then in production environments you can use -E -O to ignore environment settings and disable breakpoints.
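For reference, here is a rough Python sketch (an approximation for illustration, not the actual implementation) of how PEP 553's default sys.breakpointhook consults the environment variable; under -E the variable simply goes unread, which is the behavior being debated:

    import importlib
    import os

    def breakpointhook(*args, **kws):
        hook = os.environ.get('PYTHONBREAKPOINT')  # unread when -E is given
        if hook == '0':
            return None  # breakpoints disabled entirely
        if not hook:
            import pdb
            return pdb.set_trace(*args, **kws)  # the documented default
        # otherwise the value names a callable to import and invoke
        modname, _, funcname = hook.rpartition('.')
        module = importlib.import_module(modname or 'builtins')
        return getattr(module, funcname)(*args, **kws)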
Re: [Python-Dev] PEP 554 v3 (new interpreters module)
On Tue, Oct 3, 2017 at 8:55 AM, Antoine Pitrou wrote:
> I think we need a sharing protocol, not just a flag. We also need to think carefully about that protocol, so that it does not imply unnecessary memory copies. Therefore I think the protocol should be something like the buffer protocol, that allows to acquire and release a set of shared memory areas, but without imposing any semantics onto those memory areas (each type implementing its own semantics). And there needs to be a dedicated reference counting for object shares, so that the original object can be notified when all its shares have vanished.

I've come to agree. :) I actually came to the same conclusion tonight before I'd been able to read through your message carefully. My idea is below. Your suggestion about protecting shared memory areas is something to discuss further, though I'm not sure it's strictly necessary yet (before we stop sharing the GIL).

On Wed, Oct 4, 2017 at 7:41 PM, Nick Coghlan wrote:
> Having the sending interpreter do the INCREF just changes the problem to be a memory leak waiting to happen rather than an access-after-free issue, since the problematic non-synchronised scenario then becomes:
>
> * thread on CPU A has two references (ob_refcnt=2)
> * it sends a reference to a thread on CPU B via a channel
> * thread on CPU A releases its reference (ob_refcnt=1)
> * updated ob_refcnt value hasn't made it back to the shared memory cache yet
> * thread on CPU B releases its reference (ob_refcnt=1)
> * both threads have released their reference, but the refcnt is still 1 -> object leaks!
>
> We simply can't have INCREFs and DECREFs happening in different threads without some way of ensuring cache coherency for *both* operations - otherwise we risk either the refcount going to zero when it shouldn't, or *not* going to zero when it should.
>
> The current CPython implementation relies on the process global GIL for that purpose, so none of these problems will show up until you start trying to replace that with per-interpreter locks.
>
> Free threaded reference counting relies on (expensive) atomic increments & decrements.

Right. I'm not sure why I was missing that, but I'm clear now. Below is a rough idea of what I think may work instead (the result of much tossing and turning in bed*).
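To make the lost-update hazard concrete, here is a toy Python model of the scenario above (purely illustrative; real refcounting happens in C) that forces the problematic interleaving by hand:

    class Obj:
        refcnt = 2  # two live references, one per thread

    obj = Obj()

    # Unsynchronised read-modify-write, interleaved as in Nick's scenario:
    a = obj.refcnt        # thread on CPU A reads 2
    b = obj.refcnt        # thread on CPU B reads 2 (a stale value)
    obj.refcnt = a - 1    # A releases its reference -> 1
    obj.refcnt = b - 1    # B releases its reference -> still 1!

    assert obj.refcnt == 1  # both references are gone, yet the count
                            # never reaches 0: the object leaks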
While we're still sharing a GIL between interpreters:

    Channel.send(obj):  # in interp A
        incref(obj)
        if type(obj).tp_share == NULL:
            raise ValueError("not a shareable type")
        ch.objects.append(obj)

    Channel.recv():  # in interp B
        orig = ch.objects.pop(0)
        obj = orig.tp_share()
        return obj

    bytes.tp_share():
        return self

After we move to not sharing the GIL between interpreters:

    Channel.send(obj):  # in interp A
        incref(obj)
        if type(obj).tp_share == NULL:
            raise ValueError("not a shareable type")
        set_owner(obj)  # obj.owner or add an obj -> interp entry to global table
        ch.objects.append(obj)

    Channel.recv():  # in interp B
        orig = ch.objects.pop(0)
        obj = orig.tp_share()
        set_shared(obj, orig)  # add to a global table
        return obj

    bytes.tp_share():
        obj = blank_bytes(len(self))
        obj.ob_sval = self.ob_sval  # hand-wavy memory sharing
        return obj

    bytes.tp_free():  # under no-shared-GIL:
        # most of this could be pulled into a macro for re-use
        orig = lookup_shared(self)
        if orig != NULL:
            current = release_LIL()
            interp = lookup_owner(orig)
            acquire_LIL(interp)
            decref(orig)
            release_LIL(interp)
            acquire_LIL(current)
            # clear shared/owner tables
            # clear/release self.ob_sval
        free(self)

The CIV approach could be facilitated through something like a new SharedBuffer type, or through a separate BufferViewChannel, etc.

Most notably, this approach avoids hard-coding specific type support into channels and should work out fine under no-shared-GIL subinterpreters. One nice thing about the tp_share slot is that it makes it much easier (along with a C-API for managing the global owned/shared tables) to implement other types that are legal to pass through channels. Such types could be provided via extension modules. Numpy arrays could be made to support it, if that's your thing. Antoine could give tp_share to locks and semaphores. :)

Of course, any such types would have to ensure that they are actually safe to share between interpreters without a GIL between them...

For PEP 554, I'd only propose the tp_share slot and its use in Channel.send()/.recv(). The parts related to global tables and memory sharing and tp_free() wouldn't be necessary until we stop sharing the GIL between interpreters. However, I believe that tp_share would make us ready for that.

-eric

* I should know by now that some ideas sound better in the middle of the night than they do the next day, but this idea is keeping me awake so I'll risk it! :)
Re: [Python-Dev] PEP 554 v3 (new interpreters module)
On 5 October 2017 at 18:45, Eric Snow wrote:
> After we move to not sharing the GIL between interpreters:
>
> Channel.send(obj):  # in interp A
>     incref(obj)
>     if type(obj).tp_share == NULL:
>         raise ValueError("not a shareable type")
>     set_owner(obj)  # obj.owner or add an obj -> interp entry to global table
>     ch.objects.append(obj)
>
> Channel.recv():  # in interp B
>     orig = ch.objects.pop(0)
>     obj = orig.tp_share()
>     set_shared(obj, orig)  # add to a global table
>     return obj

This would be hard to get to work reliably, because "orig.tp_share()" would be running in the receiving interpreter, but all the attributes of "orig" would have been allocated by the sending interpreter. It gets more reliable if it's *Channel.send* that calls tp_share(), but moving the call to the sending side makes it clear that a tp_share protocol would still need to rely on a more primitive set of "shareable objects" that were the permitted return values from the tp_share call.

And that's the real pay-off that comes from defining this in terms of the memoryview protocol: Py_buffer structs *aren't* Python objects, so it's only a regular C struct that gets passed across the interpreter boundary (the reference to the original objects gets carried along passively as part of the CIV - it never gets *used* in the receiving interpreter).

> bytes.tp_share():
>     obj = blank_bytes(len(self))
>     obj.ob_sval = self.ob_sval  # hand-wavy memory sharing
>     return obj

This is effectively reinventing memoryview, while trying to pretend it's an ordinary bytes object. Don't reinvent memoryview :)

> bytes.tp_free():  # under no-shared-GIL:
>     # most of this could be pulled into a macro for re-use
>     orig = lookup_shared(self)
>     if orig != NULL:
>         current = release_LIL()
>         interp = lookup_owner(orig)
>         acquire_LIL(interp)
>         decref(orig)
>         release_LIL(interp)
>         acquire_LIL(current)
>         # clear shared/owner tables
>         # clear/release self.ob_sval
>     free(self)

I don't think we should be touching the behaviour of core builtins solely to enable message passing to subinterpreters without a shared GIL.

The simplest possible variant of CIVs that I can think of would be able to avoid that outcome by being a memoryview subclass, since they just need to hold the extra reference to the original interpreter, and include some logic to switch interpreters at the appropriate time.

That said, I think there's definitely a useful design question to ask in this area, not about bytes (which can be readily represented by a memoryview variant in the receiving interpreter), but about *strings*: they have a more complex internal layout than bytes objects, but as long as the receiving interpreter can make sure that the original string continues to exist, then you could usefully implement a "strview" type to avoid having to go through an encode/decode cycle just to pass a string to another subinterpreter.

That would provide a reasonably compelling argument that CIVs *shouldn't* be implemented as memoryview subclasses, but instead defined as *containing* a managed view of an object owned by a different interpreter. That way, even if the initial implementation only supported CIVs that contained a memoryview instance, we'd have the freedom to define other kinds of views later (such as strview), while being able to reuse the same CIV machinery.

Cheers,
Nick.
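As a purely illustrative sketch of that last distinction (every name below is invented; nothing like this exists in CPython), a containment-based CIV might look roughly like:

    from contextlib import contextmanager

    @contextmanager
    def _running_in(interp):
        # Stand-in for "temporarily switch to the owning interpreter";
        # the real operation would happen in C at the appropriate time.
        yield

    class CrossInterpreterView:
        # Hypothetical: *contains* a view of data owned by another
        # interpreter, rather than *being* a memoryview subclass.
        def __init__(self, view, owner_interp):
            self._view = view           # e.g. a memoryview over the sender's buffer
            self._owner = owner_interp  # interpreter that must do the final release

        def raw(self):
            return self._view           # hand out the contained view

        def release(self):
            # the final decref must happen under the owning interpreter
            with _running_in(self._owner):
                self._view.release()

Because the view is contained rather than inherited, the same machinery could later hold a strview or any other kind of view, as suggested above.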
--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] PEP 553; the interaction between $PYTHONBREAKPOINT and -E
> What if we make the default value depend on the debug level? In debug mode it is "pdb.set_trace", in optimized mode it is "0". Then in production environments you can use -E -O to ignore environment settings and disable breakpoints.

I don't know what the best option is, but I dislike adding two options, PYTHONBREAKPOINT and -X nobreakpoint, for the same feature. It would become complicated to know which option has priority.

I would prefer a generic "release mode" option. In the past, I proposed the opposite: a "developer mode":
https://mail.python.org/pipermail/python-ideas/2016-March/039314.html

"python3 -X dev" would be an "alias" to "PYTHONMALLOC=debug python3.6 -Wd -bb -X faulthandler script.py".

Python has more and more options to enable debug checks at runtime; it's hard to be aware of all of them. My intent is to run tests in "developer mode": if tests pass, you are sure that they will pass in the regular mode, since the developer mode only enables more checks at runtime; it shouldn't change the behaviour.

It seems like the consensus is more to run Python in "release mode" by default, since it was decided to hide DeprecationWarning by default. I understood that the default mode targets end users.

Victor
Re: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files)
On Oct 4, 2017, at 13:53, Benjamin Peterson wrote:

> It might be helpful to enumerate the usecases for such an API. Perhaps a narrow, specialized API could satisfy most needs in a supportable way.

Currently `python -m dis thing.py` compiles the source then disassembles it. It would be kind of cool if you could pass a .pyc file to -m dis, in which case you'd need to unpack the header to get to the code object. A naive implementation would unpack the magic number and refuse to disassemble any files that don't match whatever that version of Python understands. A more robust (possibly 3rd party) implementation could potentially disassemble a range of magic numbers and formats, and an API to get at the code object and metadata would help.

I was thinking about the bytecode hacking that some debuggers do. This API would help them support multiple versions of Python. They could use the API to discover what pyc format was in use, extract the code object, hack the bytecode and possibly rewrite a new PEP 3147 style pyc file with the debugger bytecodes inserted.

Third party bytecode optimizers could use the API to unpack multiple versions of pyc files, do their optimizations, and rewrite new files with the proper format.

Cheers,
-Barry
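As a rough sketch of the header unpacking involved (this assumes the PEP 552 layout for 3.7+ -- a 4-byte magic number, a 4-byte flags word, then 8 bytes of timestamp+size or source hash -- earlier versions differ, which is exactly why a version-spanning API would help):

    import importlib.util
    import marshal
    import struct

    def read_code(pyc_path):
        with open(pyc_path, 'rb') as f:
            magic = f.read(4)
            if magic != importlib.util.MAGIC_NUMBER:
                raise ValueError('pyc from a different Python version')
            flags = struct.unpack('<I', f.read(4))[0]  # PEP 552 flags word
            f.read(8)  # timestamp + source size, or source hash, per flags
            return marshal.load(f)  # the code object itself

A multi-version implementation would replace the hard magic-number check with a table of known magic numbers and header formats.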
Re: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files)
Honestly I think the API for accessing historic pyc headers should itself also be 3rd party. CPython itself should not bother (backwards compatibility with pyc files has never been a feature).

On Thu, Oct 5, 2017 at 6:44 AM, Barry Warsaw wrote:
> On Oct 4, 2017, at 13:53, Benjamin Peterson wrote:
>
>> It might be helpful to enumerate the usecases for such an API. Perhaps a narrow, specialized API could satisfy most needs in a supportable way.
>
> Currently `python -m dis thing.py` compiles the source then disassembles it. It would be kind of cool if you could pass a .pyc file to -m dis, in which case you'd need to unpack the header to get to the code object. A naive implementation would unpack the magic number and refuse to disassemble any files that don't match whatever that version of Python understands. A more robust (possibly 3rd party) implementation could potentially disassemble a range of magic numbers and formats, and an API to get at the code object and metadata would help.
>
> I was thinking about the bytecode hacking that some debuggers do. This API would help them support multiple versions of Python. They could use the API to discover what pyc format was in use, extract the code object, hack the bytecode and possibly rewrite a new PEP 3147 style pyc file with the debugger bytecodes inserted.
>
> Third party bytecode optimizers could use the API to unpack multiple versions of pyc files, do their optimizations, and rewrite new files with the proper format.
>
> Cheers,
> -Barry

--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] Reorganize Python categories (Core, Library, ...)?
On Wed, Oct 4, 2017 at 11:52 AM, Victor Stinner wrote:
> Hi,
>
> Python uses a few categories to group bugs (on bugs.python.org) and NEWS entries (in the Python changelog). List used by the blurb tool:
>
> #.. section: Security
> #.. section: Core and Builtins
> #.. section: Library
> #.. section: Documentation
> #.. section: Tests
> #.. section: Build
> #.. section: Windows
> #.. section: macOS
> #.. section: IDLE
> #.. section: Tools/Demos
> #.. section: C API
>
> My problem is that almost all changes go into the "Library" category. When I read long changelogs, it's sometimes hard to quickly identify the context (ex: impacted modules) of a change.
>
> It's also hard to find open bugs of a specific module on bugs.python.org, since almost all bugs are in the very generic "Library" category. Using full-text search returns "false positives".
>
> I would prefer to see more specific categories like:
>
> * Buildbots: only issues specific to buildbots
> * Networking: socket, asyncio, asyncore, asynchat modules
> * Security: ssl module but also vulnerabilities in any other part of CPython -- we already added a Security category in NEWS/blurb
> * Parallelism: multiprocessing and concurrent.futures modules
>
> It's hard to find categories generic enough to not only contain a single item, but not contain too many items either. Other ideas:
>
> * XML: xml.dom, xml.etree, xml.parsers, xml.sax modules
> * Import machinery: imp and importlib modules
> * Typing: abc and typing modules
>
> The best would be to have a mapping of module names to categories, and make sure that all modules have a category. We might count the number of commits and NEWS entries of the last 12 months to decide if a category has the correct size.
>
> I don't think that we need a distinct category for each module. We can put many uncommon modules in a generic category.
>
> By the way, maybe we also need a new "module name" field in the bug tracker. But then comes the question of normalizing module names. For example, should "email.message" be normalized to "email"? Maybe store "email.message" but use "email" for search, display the module in the issue title, etc.
>
> Victor

Personally I've always dreamed about having *all* module names. That would reflect the experts.rst file:
https://github.com/python/devguide/blob/master/experts.rst

--
Giampaolo - http://grodola.blogspot.com
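On the normalization question, one simple rule (an illustration only, not an implemented tracker feature) would be to store the full dotted name but index and display on the top-level package:

    def normalize(module_name):
        # "email.message" is stored as-is but searched as "email"
        return module_name.partition('.')[0]

    assert normalize('email.message') == 'email'
    assert normalize('xml.etree.ElementTree') == 'xml'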
Re: [Python-Dev] Inheritance vs composition in backcompat (PEP521)
On Tue, Oct 3, 2017 at 1:11 AM, Koos Zevenhoven wrote:
> On Oct 3, 2017 01:00, "Guido van Rossum" wrote:
>> On Mon, Oct 2, 2017 at 2:52 PM, Koos Zevenhoven wrote:
>>> I don't mind this (or Nathaniel ;-) being academic. The backwards incompatibility issue I've just described applies to any extension via composition, if the underlying type/protocol grows new members (like the CM protocol would have gained __suspend__ and __resume__ in PEP 521).
>>
>> Since you seem to have a good grasp on this issue, does PEP 550 suffer from the same problem? (Or PEP 555, for that matter? :-)
>
> Neither has this particular issue, because they don't extend an existing protocol. If this thread has any significance, it will most likely be elsewhere.

Actually, I realize I should be more precise with terminology regarding "extending an existing protocol"/"growing new members". Below, I'm still using PEP 521 as an example (sorry).

In fact, in some sense, "adding" __suspend__ and __resume__ to context managers *does not* extend the context manager protocol, even though it kind of looks like it does. There would instead be two separate protocols:

(A) The traditional PEP 343 context manager:
    __enter__
    __exit__

(B) The hypothetical PEP 521 context manager:
    __enter__
    __suspend__
    __resume__
    __exit__

Protocols A and B are incompatible in both directions:

* It is generally not safe to use a type-A context manager assuming it implements B.
* It is generally not safe to use a type-B context manager assuming it implements A.

But if you now have a type-B object, it looks like it's also type-A, especially for code that is not aware of the existence of B. This is where the problems come from: a wrapper for type A does the wrong thing when wrapping a type-B object (except when using inheritance).

[Side note: Another interpretation of the situation is that, instead of adding protocol B, A is removed and replaced with:

(C) The hypothetical PEP 521 context manager with optional members:
    __enter__
    __suspend__ (optional)
    __resume__ (optional)
    __exit__

But now the same problems just come from the fact that A no longer exists while there is code out there that assumes A. This is only a useful interpretation if you are the only user of the protocol or if it's otherwise OK to remove A. So let's go back to the A-B interpretation.]

Q: Could the problem of protocol conflict be solved?

One way to tell A and B apart would be to always explicitly mark the protocol with a base class. Obviously this is not the case with existing uses of context managers. But there's another way, which is to change the naming:

(A) The traditional PEP 343 context manager:
    __enter__
    __exit__

(Z) The *modified* hypothetical PEP 521 context manager:
    __begin__
    __suspend__
    __resume__
    __end__

Now A and Z are easy to tell apart. A context manager wrapper designed for type A immediately fails if used to wrap a type-Z object. But of course the whole context manager concept has now suddenly become a lot more complicated.

It is interesting that, in the A-B scheme, making a general context manager wrapper using inheritance *just works*, even though A is not a subprotocol of B and B is not a subprotocol of A.

Anyway, a lot of this is amplified by the fact that the methods of the context manager protocols are not independent functionality. Instead, calling one of them leads to the requirement that the other methods are also called at the right moments.
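A minimal sketch of that wrapper problem (hypothetical classes, using PEP 521's __suspend__/__resume__ as the type-B members):

    class LoggingCM:
        # Composition: forwards only protocol A (__enter__/__exit__).
        def __init__(self, cm):
            self._cm = cm

        def __enter__(self):
            print('enter')
            return self._cm.__enter__()

        def __exit__(self, *exc_info):
            print('exit')
            return self._cm.__exit__(*exc_info)

        # No __suspend__/__resume__ here: a framework that checks for
        # protocol B will treat a wrapped type-B manager as plain type A,
        # silently skipping its suspend/resume behaviour.

A wrapper that instead *inherits* from the wrapped type picks up __suspend__ and __resume__ automatically, which is why the inheritance-based wrapper "just works" above.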
--Koos

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +
Re: [Python-Dev] PEP 553; the interaction between $PYTHONBREAKPOINT and -E
> I don't know what the best option is, but I dislike adding two options, PYTHONBREAKPOINT and -X nobreakpoint, for the same feature. It would become complicated to know which option has priority.

Just to close the loop, I've landed the PEP 553 PR, treating PYTHONBREAKPOINT the same as all other environment variables when -E is present. Let's see how that goes. Thanks all for the great feedback and reviews.

Now I'm thinking about putting a backport version on PyPI. :)

Cheers,
-Barry
Re: [Python-Dev] PEP 554 v3 (new interpreters module)
On Thu, Oct 5, 2017 at 4:57 AM, Nick Coghlan wrote:
> This would be hard to get to work reliably, because "orig.tp_share()" would be running in the receiving interpreter, but all the attributes of "orig" would have been allocated by the sending interpreter. It gets more reliable if it's *Channel.send* that calls tp_share(), but moving the call to the sending side makes it clear that a tp_share protocol would still need to rely on a more primitive set of "shareable objects" that were the permitted return values from the tp_share call.

The point of running tp_share() in the receiving interpreter is to force allocation under that interpreter, so that GC applies there. I agree that you basically can't do anything in tp_share() that would affect the sending interpreter, including INCREF and DECREF. Since we INCREFed in send(), we know that we have a safe reference, so we don't have to worry about that part in tp_share(). We would only be able to do low-level things (like the buffer protocol) that don't interact with the original object's interpreter.

Given that this is a quite low-level tp slot and low-level functionality, I'd expect that a sufficiently clear entry (i.e. warning) in the docs would be enough for the few that dare.

From my perspective, adding the tp_share slot allows for much more experimentation with object sharing (right now, long before we get to considering how to stop sharing the GIL) by us *and* third parties. None of the alternatives seem to offer the same opportunity while still working out *after* we stop sharing the GIL.

> And that's the real pay-off that comes from defining this in terms of the memoryview protocol: Py_buffer structs *aren't* Python objects, so it's only a regular C struct that gets passed across the interpreter boundary (the reference to the original objects gets carried along passively as part of the CIV - it never gets *used* in the receiving interpreter).

Yeah, the (PEP 3118) buffer protocol offers precedent in a number of ways that are applicable to channels here. I'm simply reluctant to lock PEP 554 into such a specific solution as the buffer-specific CIV. I'm trying to accommodate anticipated future needs while keeping the PEP as simple and basic as possible. It's driving me nuts! :P Things were *much* simpler before I added Channels to the PEP. :)

>> bytes.tp_share():
>>     obj = blank_bytes(len(self))
>>     obj.ob_sval = self.ob_sval  # hand-wavy memory sharing
>>     return obj
>
> This is effectively reinventing memoryview, while trying to pretend it's an ordinary bytes object. Don't reinvent memoryview :)
>
>> bytes.tp_free():  # under no-shared-GIL:
>>     # most of this could be pulled into a macro for re-use
>>     orig = lookup_shared(self)
>>     if orig != NULL:
>>         current = release_LIL()
>>         interp = lookup_owner(orig)
>>         acquire_LIL(interp)
>>         decref(orig)
>>         release_LIL(interp)
>>         acquire_LIL(current)
>>         # clear shared/owner tables
>>         # clear/release self.ob_sval
>>     free(self)
>
> I don't think we should be touching the behaviour of core builtins solely to enable message passing to subinterpreters without a shared GIL.

Keep in mind that I included the above as a possible solution using tp_share() that would work *after* we stop sharing the GIL. My point is that with tp_share() we have a solution that works now *and* will work later. I don't care how we use tp_share to do so. :) I long to be able to say in the PEP that you can pass bytes through the channel and get bytes on the other side.
That said, I'm not sure how this could be made to work without involving tp_free(). If that is really off the table (even in the simplest possible ways) then I don't think there is a way to actually share objects of builtin types between interpreters other than through views like CIV. We could still support tp_share() for the sake of third parties, which would facilitate that simplicity I was aiming for in sending data between interpreters, as well as leaving the door open for nearly all the same experimentation.

However, I expect that most *uses* of channels will involve builtin types, particularly as we start off, so having to rely on view types for builtins would add not-insignificant awkwardness to using channels. I'd still like to avoid that if possible, so let's not rush to completely close the door on small modifications to tp_free for builtins. :)

Regardless, I still (after a night's rest and a day of not thinking about it) consider tp_share() to be the solution I'd been hoping we'd find, whether or not we can apply it to builtin types.

> The simplest possible variant of CIVs that I can think of would be able to avoid that outcome by being a memoryview subclass, since they just need to hold the extra reference to the original interpreter, and include some logic to switch interpreters at the appropriate time.
[Python-Dev] how/where is open() implemented ?
Hi,

I am looking for the implementation of open() in the src, but so far I have not been able to find it.

From my observation, the implementation of open() in Python 2/3 does not employ the open(2) system call. However, without open(2), how can one possibly obtain a file descriptor?

Yubin
Re: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files)
On 5 October 2017 at 23:44, Barry Warsaw wrote:
> On Oct 4, 2017, at 13:53, Benjamin Peterson wrote:
>
>> It might be helpful to enumerate the usecases for such an API. Perhaps a narrow, specialized API could satisfy most needs in a supportable way.
>
> Currently `python -m dis thing.py` compiles the source then disassembles it. It would be kind of cool if you could pass a .pyc file to -m dis, in which case you'd need to unpack the header to get to the code object. A naive implementation would unpack the magic number and refuse to disassemble any files that don't match whatever that version of Python understands. A more robust (possibly 3rd party) implementation could potentially disassemble a range of magic numbers and formats, and an API to get at the code object and metadata would help.
>
> I was thinking about the bytecode hacking that some debuggers do. This API would help them support multiple versions of Python. They could use the API to discover what pyc format was in use, extract the code object, hack the bytecode and possibly rewrite a new PEP 3147 style pyc file with the debugger bytecodes inserted.
>
> Third party bytecode optimizers could use the API to unpack multiple versions of pyc files, do their optimizations, and rewrite new files with the proper format.

Actually doing that properly also requires keeping track of which opcodes were valid in different versions of the eval loop, so as Guido suggests, such an abstraction layer would make the most sense as a third party project that tracked:

- the magic number for each CPython feature release (plus the 3.5.3+ anomaly)
- the pyc header format for each CPython feature release
- the valid opcode set for each CPython feature release
- any other version dependent variations (e.g. the expected stack layout for BUILD_MAP changed in Python 3.5, when the evaluation order for dict displays was updated to be key then value, rather than the other way around)

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] Reorganize Python categories (Core, Library, ...)?
On 6 October 2017 at 06:35, Giampaolo Rodola' wrote:
> On Wed, Oct 4, 2017 at 11:52 AM, Victor Stinner wrote:
>> By the way, maybe we also need a new "module name" field in the bug tracker. But then comes the question of normalizing module names. For example, should "email.message" be normalized to "email"? Maybe store "email.message" but use "email" for search, display the module in the issue title, etc.
>>
>> Victor
>
> Personally I've always dreamed about having *all* module names. That would reflect the experts.rst file:
> https://github.com/python/devguide/blob/master/experts.rst

Right. One UX note though, based on similarly long lists in the Bugzilla component fields for Fedora and RHEL: list boxes don't scale well to really long lists of items, so such a field would ideally be based on a combo-box with typeahead support. (We have something like that already for the nosy list, where the typeahead support checks for Experts Index entries.)

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] how/where is open() implemented ?
2017-10-05 19:19 GMT-07:00 Yubin Ruan:
> Hi,
> I am looking for the implementation of open() in the src, but so far I have not been able to find it.

In Python 3, builtins.open is the same as io.open, which is implemented as the _io_open function in Modules/_io/_iomodule.c.

> From my observation, the implementation of open() in Python 2/3 does not employ the open(2) system call. However, without open(2), how can one possibly obtain a file descriptor?

There is a call to open() (the C function) in _io_FileIO___init___impl in Modules/_io/fileio.c. I haven't traced through all the code, but I suspect builtins.open ends up calling that.
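A quick check from the REPL confirms the aliasing (Python 3):

    >>> import io
    >>> open is io.open  # builtins.open and io.open are the same object
    True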
Re: [Python-Dev] how/where is open() implemented ?
On 2017-10-06 03:19, Yubin Ruan wrote:
> Hi,
> I am looking for the implementation of open() in the src, but so far I have not been able to find it.
>
> From my observation, the implementation of open() in Python 2/3 does not employ the open(2) system call. However, without open(2), how can one possibly obtain a file descriptor?

I think it's somewhere in here:
https://github.com/python/cpython/blob/master/Modules/_io/fileio.c
Re: [Python-Dev] PEP 554 v3 (new interpreters module)
On 6 October 2017 at 11:48, Eric Snow wrote:
>> And that's the real pay-off that comes from defining this in terms of the memoryview protocol: Py_buffer structs *aren't* Python objects, so it's only a regular C struct that gets passed across the interpreter boundary (the reference to the original objects gets carried along passively as part of the CIV - it never gets *used* in the receiving interpreter).
>
> Yeah, the (PEP 3118) buffer protocol offers precedent in a number of ways that are applicable to channels here. I'm simply reluctant to lock PEP 554 into such a specific solution as the buffer-specific CIV. I'm trying to accommodate anticipated future needs while keeping the PEP as simple and basic as possible. It's driving me nuts! :P Things were *much* simpler before I added Channels to the PEP. :)

Starting with memory-sharing only doesn't lock us into anything, since you can still add a more flexible kind of channel based on a different protocol later if it turns out that memory sharing isn't enough. By contrast, if you make the initial channel semantics incompatible with multiprocessing by design, you *will* prevent anyone from experimenting with replicating the shared memory based channel API for communicating between processes :)

That said, if you'd prefer to keep the "Channel" name available for the possible introduction of object channels at a later date, you could call the initial memoryview based channel a "MemChannel".

>> I don't think we should be touching the behaviour of core builtins solely to enable message passing to subinterpreters without a shared GIL.
>
> Keep in mind that I included the above as a possible solution using tp_share() that would work *after* we stop sharing the GIL. My point is that with tp_share() we have a solution that works now *and* will work later. I don't care how we use tp_share to do so. :) I long to be able to say in the PEP that you can pass bytes through the channel and get bytes on the other side.

Memory views are a builtin type as well, and they emphasise the practical benefit we're trying to get relative to typical multiprocessing arrangements: zero-copy data sharing.

So here's my proposed experimentation-enabling development strategy:

1. Start out with a MemChannel API that accepts any buffer-exporting object as input, and outputs only a cross-interpreter memoryview subclass
2. Use that as the basis for the work to get to a per-interpreter locking arrangement that allows subinterpreters to fully exploit multiple CPUs
3. Only then try to design a Channel API that allows for sharing builtin immutable objects between interpreters (bytes, strings, numbers), at a time when you can be certain you won't be inadvertently making it harder to make the GIL a truly per-interpreter lock, rather than the current process global runtime lock

The key benefit of this approach is that we *know* MemChannel can work: the buffer protocol already operates at the level of C structs and pointers, not Python objects, and there are already plenty of interesting buffer-protocol-supporting objects around, so as long as the CIV switches interpreters at the right time, there aren't any fundamentally new runtime level capabilities needed to implement it.
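As a purely hypothetical usage sketch of step 1 (every name below is invented for illustration; PEP 554 was still a draft and nothing called MemChannel exists):

    # hypothetical API, loosely following PEP 554's proposed "interpreters" module
    import interpreters  # does not exist; an assumption for the sketch

    r, s = interpreters.create_mem_channel()  # invented constructor

    s.send(b'some payload')  # accepts any buffer-exporting object
    view = r.recv()          # yields a cross-interpreter memoryview subclass
    assert bytes(view) == b'some payload'  # zero-copy until this conversion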
The lower level MemChannel API could then also be replicated for multiprocessing, while the higher level more speculative object-based Channel API would be specific to subinterpreters (and probably only ever designed and implemented if you first succeed in making subinterpreters sufficiently independent that they don't rely on a process-wide GIL any more).

So I'm not saying "Never design an object-sharing protocol specifically for use with subinterpreters". I'm saying "You don't have a demonstrated need for that yet, so don't try to define it until you do".

> My mind is drawn to the comparison between that and the question of CIV vs. tp_share(). CIV would be more like the post-451 import world, where I expect the CIV would take care of the data sharing operations. That said, the situation in PEP 554 is sufficiently different that I'm not convinced a generic CIV protocol would be better. I'm not sure how much CIV could do for you over helpers+tp_share.
>
> Anyway, here are the leading approaches that I'm looking at now:
>
> * adding a tp_share slot
>   + you send() the object directly and recv() the object coming out of tp_share() (which will probably be the same type as the original)
>   + this would eventually require small changes in tp_free for participating types
>   + we would likely provide helpers (eventually), similar to the new buffer protocol, to make it easier to manage sharing data

I'm skeptical about this approach because you'll be designing in a vacuum against future possible constraints that you can't test yet: the inherent complexity in the object sharing protocol will come from *not* having a process-wide GIL,