Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
Fred Drake, 08.02.2012 05:41:

> On Tue, Feb 7, 2012 at 11:31 PM, Eli Bendersky wrote:
>> Besides, in
>> http://mail.python.org/pipermail/python-dev/2011-December/114812.html
>> Stefan Behnel said "[...] Today, ET is *only* being maintained in the
>> stdlib by Florent Xicluna [...]". Is this not true?
>
> I don't know. I took this to be an observation rather than a declaration
> of intent by the package owner (Fredrik Lundh).

This observation resulted from the fact that Fredrik hasn't updated the code in his public ElementTree repository(ies) since 2009, i.e. well before the releases of Python 2.7 and 3.2 that integrated these changes.

https://bitbucket.org/effbot/et-2009-provolone/overview

The integration of ElementTree 1.3 into the standard library was almost exclusively done by Florent, with some supporting comments by Fredrik. Note that ElementTree 1.3 has not even been officially released yet, so the only "final" public release of it is in the standard library.

Since then, Florent has been actively working on bug tickets, most of which have not received any reaction from Fredrik. That makes me consider it the reality that "today, ET is only being maintained in the stdlib".

>> P.S. Would declaring that xml.etree is now independently maintained by
>> pydev be a bad thing? Why?
>
> So long as Fredrik owns the package, I think forking it for the standard
> library would be a bad thing, though not for technical reasons. Fredrik
> provided his libraries for the standard library in good faith, and we still
> list him as the external maintainer. Until *that* changes, forking would
> be inappropriate. I'd much rather see a discussion with Fredrik about the
> future maintenance plan for ElementTree and cElementTree.

I haven't received a response to my e-mails to him since early 2010. Maybe others have more luck if they try, but I don't have the impression that waiting another two years gets us anywhere interesting.
Given that it was two months ago that I started the "Fixing the XML batteries" thread (and years since I first brought up the topic), it already seems hard enough to get anyone on python-dev to actually do something for Python's XML support, instead of just actively discouraging those who invest time and work into it.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Add a new "locale" codec?
Is the idea to have:

    b"foo".decode("locale")

be roughly equivalent to:

    encoding = locale.getpreferredencoding(False)
    b"foo".decode(encoding)

?
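Spelled out as a sketch, the equivalence being asked about would behave like the helper below. This is purely illustrative: no "locale" codec is actually registered, and `decode_locale` is an invented name.

```python
import locale

def decode_locale(data):
    # What b"foo".decode("locale") would presumably do: look up the
    # *current* locale's preferred encoding at call time (False means
    # "do not call setlocale() first"), then decode with it.
    encoding = locale.getpreferredencoding(False)
    return data.decode(encoding)

print(decode_locale(b"foo"))  # 'foo' in any ASCII-compatible locale
```

The key semantic point is that the encoding is resolved at each call, so the result can change if the process's locale changes between calls.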
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
>> The facade can be added to xml/etree/ElementTree.py since that's the
>> only documented module. It can attempt to do:
>>
>> from _elementtree import *
>>
>> (which is what cElementTree.py does), and on failure, just go on doing
>> what it does now.
>
> Basically, cElementTree (actually the accelerator module) reuses everything
> from ElementTree that it does not implement itself, e.g. the serialiser or
> the ElementPath implementation in ElementPath.py (which is not commonly
> used by itself anyway).
>
> ElementInclude is meant to be independently imported by user code and works
> with both implementations, although it uses plain ElementTree by default
> and currently needs explicit configuration for cElementTree. It looks like
> that need would vanish when ElementTree uses the accelerator module
> internally.
>
> So, ElementTree.py is a superset of cElementTree's C module, and importing
> that C module into ElementTree.py instead of only importing it into
> cElementTree.py would just make ElementTree.py faster; that's basically it.

Yep. Any objections from pydev?

Stefan, in the other thread ("... XML batteries") you said you would contact Fredrik — did you manage to get hold of him?

Eli
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
Eli Bendersky, 08.02.2012 07:07:

> On Wed, Feb 8, 2012 at 07:10, Fred Drake wrote:
>> On Tue, Feb 7, 2012 at 11:46 PM, Eli Bendersky wrote:
>>> The initial proposal is to change *the stdlib
>>> import facade* for xml.etree.ElementTree to use the C accelerator
>>> (_elementtree) by default.
>>
>> I guess this is one source of confusion: what are you referring to as
>> an "import façade"? When I look in Lib/xml/etree/, I see the ElementTree,
>> ElementPath, and ElementInclude modules, and a wrapper for cElementTree's
>> extension module.
>>
>> There isn't any sort of façade for ElementTree; are you proposing to add
>> one, perhaps in xml/etree/__init__.py?
>
> AFAICS ElementPath is a helper used by ElementTree, and cElementTree
> has one of its own. It's not documented for stand-alone use.
> ElementInclude also isn't documented and doesn't appear to be used
> anywhere.
>
> The facade can be added to xml/etree/ElementTree.py since that's the
> only documented module. It can attempt to do:
>
> from _elementtree import *
>
> (which is what cElementTree.py does), and on failure, just go on doing
> what it does now.

Basically, cElementTree (actually the accelerator module) reuses everything from ElementTree that it does not implement itself, e.g. the serialiser or the ElementPath implementation in ElementPath.py (which is not commonly used by itself anyway).

ElementInclude is meant to be independently imported by user code and works with both implementations, although it uses plain ElementTree by default and currently needs explicit configuration for cElementTree. It looks like that need would vanish when ElementTree uses the accelerator module internally.

So, ElementTree.py is a superset of cElementTree's C module, and importing that C module into ElementTree.py instead of only importing it into cElementTree.py would just make ElementTree.py faster; that's basically it.
Stefan
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
On Wed, Feb 8, 2012 at 07:10, Fred Drake wrote:

> On Tue, Feb 7, 2012 at 11:46 PM, Eli Bendersky wrote:
>> The initial proposal is to change *the stdlib
>> import facade* for xml.etree.ElementTree to use the C accelerator
>> (_elementtree) by default.
>
> I guess this is one source of confusion: what are you referring to as
> an "import façade"? When I look in Lib/xml/etree/, I see the ElementTree,
> ElementPath, and ElementInclude modules, and a wrapper for cElementTree's
> extension module.
>
> There isn't any sort of façade for ElementTree; are you proposing to add
> one, perhaps in xml/etree/__init__.py?

AFAICS ElementPath is a helper used by ElementTree, and cElementTree has one of its own. It's not documented for stand-alone use. ElementInclude also isn't documented and doesn't appear to be used anywhere.

The facade can be added to xml/etree/ElementTree.py since that's the only documented module. It can attempt to do:

    from _elementtree import *

(which is what cElementTree.py does), and on failure, just go on doing what it does now.

Eli
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
On Tue, Feb 7, 2012 at 11:46 PM, Eli Bendersky wrote:

> The initial proposal is to change *the stdlib
> import facade* for xml.etree.ElementTree to use the C accelerator
> (_elementtree) by default.

I guess this is one source of confusion: what are you referring to as an "import façade"? When I look in Lib/xml/etree/, I see the ElementTree, ElementPath, and ElementInclude modules, and a wrapper for cElementTree's extension module.

There isn't any sort of façade for ElementTree; are you proposing to add one, perhaps in xml/etree/__init__.py?

-Fred

--
Fred L. Drake, Jr.
"A person who won't read has no advantage over one who can't read."
   --Samuel Langhorne Clemens
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
On Wed, Feb 8, 2012 at 06:41, Fred Drake wrote:

> On Tue, Feb 7, 2012 at 11:31 PM, Eli Bendersky wrote:
>> Besides, in
>> http://mail.python.org/pipermail/python-dev/2011-December/114812.html
>> Stefan Behnel said "[...] Today, ET is *only* being maintained in the
>> stdlib by Florent Xicluna [...]". Is this not true?
>
> I don't know. I took this to be an observation rather than a declaration
> of intent by the package owner (Fredrik Lundh).
>
>> P.S. Would declaring that xml.etree is now independently maintained by
>> pydev be a bad thing? Why?
>
> So long as Fredrik owns the package, I think forking it for the standard
> library would be a bad thing, though not for technical reasons. Fredrik
> provided his libraries for the standard library in good faith, and we still
> list him as the external maintainer. Until *that* changes, forking would
> be inappropriate. I'd much rather see a discussion with Fredrik about the
> future maintenance plan for ElementTree and cElementTree.

Yes, I realize this is a loaded issue, and I agree that all steps in this direction should be taken with Fredrik's agreement. However, to re-focus: the initial proposal is to change *the stdlib import facade* for xml.etree.ElementTree to use the C accelerator (_elementtree) by default. Will that somehow harm Fredrik's sovereignty over ET? Are there any other problems hidden here?

Because if not, it appears that a change of only a few lines of code could provide a significantly better XML processing experience in 3.3 for a lot of users (and save some keystrokes for the ones who already know to look for cElementTree).

Eli
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
On Tue, Feb 7, 2012 at 11:31 PM, Eli Bendersky wrote:

> Besides, in
> http://mail.python.org/pipermail/python-dev/2011-December/114812.html
> Stefan Behnel said "[...] Today, ET is *only* being maintained in the
> stdlib by Florent Xicluna [...]". Is this not true?

I don't know. I took this to be an observation rather than a declaration of intent by the package owner (Fredrik Lundh).

> P.S. Would declaring that xml.etree is now independently maintained by
> pydev be a bad thing? Why?

So long as Fredrik owns the package, I think forking it for the standard library would be a bad thing, though not for technical reasons. Fredrik provided his libraries for the standard library in good faith, and we still list him as the external maintainer. Until *that* changes, forking would be inappropriate. I'd much rather see a discussion with Fredrik about the future maintenance plan for ElementTree and cElementTree.

-Fred

--
Fred L. Drake, Jr.
"A person who won't read has no advantage over one who can't read."
   --Samuel Langhorne Clemens
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Tue, Feb 7, 2012 at 8:47 PM, Nick Coghlan wrote:

> On Wed, Feb 8, 2012 at 12:54 PM, Terry Reedy wrote:
>> On 2/7/2012 9:35 PM, PJ Eby wrote:
>>> It's just that not everything I write can depend on Importing.
>>> Throw an equivalent into the stdlib, though, and I guess I wouldn't have
>>> to worry about dependencies...
>>
>> And that is what I think (agree?) should be done to counteract the likely
>> slowdown from using importlib.
>
> Yeah, this is one frequently reinvented wheel that could definitely do
> with a standard implementation. Christian Heimes made an initial
> attempt at such a thing years ago with PEP 369, but an importlib based
> __import__ would let the implementation largely be pure Python (with
> all the increase in power and flexibility that implies).
>
> I'm not sure such an addition would help much with the base
> interpreter start up time though - most of the modules we bring in are
> because we're actually using them for some reason.
>
> The other thing that shouldn't be underrated here is the value in
> making the builtin import system PEP 302 compliant from a
> *documentation* perspective. I've made occasional attempts at fully
> documenting the import system over the years, and I always end up
> giving up because the combination of the pre-PEP 302 builtin
> mechanisms in import.c and the PEP 302 compliant mechanisms for things
> like zipimport just degenerate into a mess of special cases that are
> impossible to justify beyond "nobody got around to fixing this yet".
> The fact that we have an undocumented PEP 302 based reimplementation
> of imports squirrelled away in pkgutil to make pkgutil and runpy work
> is sheer insanity (replacing *that* with importlib might actually be a
> good first step towards full integration).

+1 on all counts

-eric
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
On Wed, Feb 8, 2012 at 06:15, Nick Coghlan wrote:

> On Wed, Feb 8, 2012 at 1:59 PM, Eli Bendersky wrote:
>> Is there a good reason why xml.etree.ElementTree /
>> xml.etree.cElementTree did not "receive this treatment"?
>
> See PEP 360, which lists "Externally Maintained Packages". In the past
> we allowed additions to the standard library without requiring that
> the standard library version become the master version. These days we
> expect python.org to become the master version, perhaps with backports
> and experimental features published on PyPI (cf. packaging vs
> distutils2, unittest vs unittest2, contextlib vs contextlib2).
>
> ElementTree was one of the last of those externally maintained modules
> added to the standard library - as documented in the PEP, it's still
> officially maintained by Fredrik Lundh. Folding the two
> implementations together in the standard library would mean officially
> declaring that xml.etree is now an independently maintained fork of
> Fredrik's version rather than just a "snapshot in time" of a
> particular version (which is what it has been historically).
>
> So the reason for keeping these two separate to date isn't technical;
> it's because Fredrik publishes them as separate modules.

The idea is to import the C module when xml.etree.ElementTree is imported, falling back to the Python module if that fails for some reason. So this is not modifying the modules, just the Python stdlib facade for them.

Besides, in http://mail.python.org/pipermail/python-dev/2011-December/114812.html Stefan Behnel said "[...] Today, ET is *only* being maintained in the stdlib by Florent Xicluna [...]". Is this not true?

Eli

P.S. Would declaring that xml.etree is now independently maintained by pydev be a bad thing? Why?
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
On Tue, Feb 7, 2012 at 22:15, Nick Coghlan wrote:

> Folding the two
> implementations together in the standard library would mean officially
> declaring that xml.etree is now an independently maintained fork of
> Fredrik's version rather than just a "snapshot in time" of a
> particular version (which is what it has been historically).

Is ElementTree even still maintained externally? I seem to remember Florent going through headaches to get changes into this area, and I can't find an external repository for this code.
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
On Wed, Feb 8, 2012 at 1:59 PM, Eli Bendersky wrote:

> Is there a good reason why xml.etree.ElementTree /
> xml.etree.cElementTree did not "receive this treatment"?

See PEP 360, which lists "Externally Maintained Packages". In the past we allowed additions to the standard library without requiring that the standard library version become the master version. These days we expect python.org to become the master version, perhaps with backports and experimental features published on PyPI (cf. packaging vs distutils2, unittest vs unittest2, contextlib vs contextlib2).

ElementTree was one of the last of those externally maintained modules added to the standard library - as documented in the PEP, it's still officially maintained by Fredrik Lundh. Folding the two implementations together in the standard library would mean officially declaring that xml.etree is now an independently maintained fork of Fredrik's version rather than just a "snapshot in time" of a particular version (which is what it has been historically).

So the reason for keeping these two separate to date isn't technical; it's because Fredrik publishes them as separate modules.

Regards,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
[Python-Dev] folding cElementTree behind ElementTree in 3.3
Hello,

Here's a note from "What's new in Python 3.0":

"""A common pattern in Python 2.x is to have one version of a module implemented in pure Python, with an optional accelerated version implemented as a C extension; for example, pickle and cPickle. This places the burden of importing the accelerated version and falling back on the pure Python version on each user of these modules. In Python 3.0, the accelerated versions are considered implementation details of the pure Python versions. Users should always import the standard version, which attempts to import the accelerated version and falls back to the pure Python version. The pickle / cPickle pair received this treatment. The profile module is on the list for 3.1. The StringIO module has been turned into a class in the io module."""

Is there a good reason why xml.etree.ElementTree / xml.etree.cElementTree did not "receive this treatment"? In the case of this module, it's quite unfortunate because:

1. The accelerated module is much faster and more memory efficient (see recent benchmarks here: http://bugs.python.org/issue11379), and XML processing is an area where performance matters.

2. The accelerated module implements the same API.

3. It's very hard to even find out that the accelerated module exists. Its sole mention in the docs is this un-emphasized line in http://docs.python.org/dev/py3k/library/xml.etree.elementtree.html: "A C implementation of this API is available as xml.etree.cElementTree." Even an experienced user who carefully reads the whole documentation will not easily notice it; for the typical user who just jumps around to the functions/methods he's interested in, it's essentially invisible.

Eli
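The per-user burden the quoted What's New passage describes is the familiar 2.x idiom sketched below. On Python 3 the first import simply fails and the fallback kicks in, so the snippet runs under either major version.

```python
# Python 2.x pattern: every *user* of the module repeats this dance.
# (On Python 3, cPickle no longer exists, so the except branch runs.)
try:
    import cPickle as pickle  # C accelerator, when available (Python 2)
except ImportError:
    import pickle             # pure-Python / standard fallback

# Either way, callers use one name from here on.
data = pickle.dumps([1, 2, 3])
print(pickle.loads(data))  # [1, 2, 3]
```

The 3.0 change moved exactly this try/except inside the module itself, so `import pickle` transparently picks up the accelerated code; the proposal in this thread is to do the same for xml.etree.ElementTree.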
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Wed, Feb 8, 2012 at 12:54 PM, Terry Reedy wrote:

> On 2/7/2012 9:35 PM, PJ Eby wrote:
>> It's just that not everything I write can depend on Importing.
>> Throw an equivalent into the stdlib, though, and I guess I wouldn't have
>> to worry about dependencies...
>
> And that is what I think (agree?) should be done to counteract the likely
> slowdown from using importlib.

Yeah, this is one frequently reinvented wheel that could definitely do with a standard implementation. Christian Heimes made an initial attempt at such a thing years ago with PEP 369, but an importlib based __import__ would let the implementation largely be pure Python (with all the increase in power and flexibility that implies).

I'm not sure such an addition would help much with the base interpreter start up time though - most of the modules we bring in are because we're actually using them for some reason.

The other thing that shouldn't be underrated here is the value in making the builtin import system PEP 302 compliant from a *documentation* perspective. I've made occasional attempts at fully documenting the import system over the years, and I always end up giving up because the combination of the pre-PEP 302 builtin mechanisms in import.c and the PEP 302 compliant mechanisms for things like zipimport just degenerate into a mess of special cases that are impossible to justify beyond "nobody got around to fixing this yet". The fact that we have an undocumented PEP 302 based reimplementation of imports squirrelled away in pkgutil to make pkgutil and runpy work is sheer insanity (replacing *that* with importlib might actually be a good first step towards full integration).

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] Fixing the XML batteries
>> On one hand I agree that ET should be emphasized since it's the better
>> API with a much faster implementation. But I also understand Martin's
>> point of view that minidom has its place, so IMHO some sort of
>> compromise should be reached. Perhaps we can recommend using ET for
>> those not specifically interested in the DOM interface, but for those
>> who *are*, minidom is still a good stdlib option (?).
>
> If you can, go ahead and write a patch saying something like that. It should
> not be hard to come up with something that is a definite improvement. Create
> a tracker issue for comment, but don't let it sit forever.

A tracker issue already exists for this - http://bugs.python.org/issue11379 - so I see no reason to open a new one. I will add my opinion there; feel free to do that too.

> Since the current policy seems to be to hide C behind Python when there is
> both, I assume that finishing the transition here is something just not
> gotten around to yet. Open another issue if there is not one.

I will open a separate discussion on this.

Eli
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On 2/7/2012 9:35 PM, PJ Eby wrote:

> On Tue, Feb 7, 2012 at 6:40 PM, Terry Reedy wrote:
>> importlib could provide a parameterized decorator for functions that are
>> the only consumers of an import. It could operate much like this:
>>
>> def imps(mod):
>>     def makewrap(f):
>>         def wrapped(*args, **kwds):
>>             print('first/only call to wrapper')
>>             g = globals()
>>             g[mod] = __import__(mod)
>>             g[f.__name__] = f
>>             f(*args, **kwds)
>>         wrapped.__name__ = f.__name__
>>         return wrapped
>>     return makewrap
>>
>> @imps('itertools')
>> def ic():
>>     print(itertools.count)
>>
>> ic()
>> ic()
>> # first/only call to wrapper
>
> If I were going to rewrite code, I'd just use lazy imports (see
> http://pypi.python.org/pypi/Importing ). They're even faster than this
> approach (or using plain import statements), as they have zero per-call
> function call overhead.

My code above and Importing, as I understand it, both delay imports until needed by using a dummy object that gets replaced at first access. (Now that I am reminded, sys.modules is the better place for the dummy objects. I just wanted to show that there is a simple solution (though more specialized) even for existing code.) The cost of delay, which might mean never, is a bit of one-time extra overhead. Both have no extra overhead after the first call. Unless delayed importing is made standard, both require a bit of extra code somewhere.

> It's just that not everything I write can depend on Importing.
> Throw an equivalent into the stdlib, though, and I guess I wouldn't have
> to worry about dependencies...

And that is what I think (agree?) should be done to counteract the likely slowdown from using importlib.

> (To be clearer; I'm talking about the
> http://peak.telecommunity.com/DevCenter/Importing#lazy-imports feature,
> which sticks a dummy module subclass instance into sys.modules, whose
> __getattribute__ does a reload() of the module, forcing the normal import
> process to run, after first changing the dummy object's type to something
> that doesn't have the __getattribute__ any more. This ensures that all
> accesses after the first one are at normal module attribute access speed.
> That, and the "whenImported" decorator from Importing would probably be of
> general stdlib usefulness too.)

--
Terry Jan Reedy
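A toy version of the sys.modules trick being discussed — not PEAK's actual implementation, just an illustrative single-threaded sketch with invented names — could look like this:

```python
import importlib
import sys
import types

class _LazyModule(types.ModuleType):
    """Placeholder that imports the real module on first attribute access."""

    def __getattribute__(self, attr):
        name = object.__getattribute__(self, "__name__")
        # Drop the placeholder so the real import machinery actually runs
        # (otherwise import_module would just hand the placeholder back).
        sys.modules.pop(name, None)
        module = importlib.import_module(name)
        # Demote the placeholder to a plain module and copy the real
        # namespace in, so every later access through old references is
        # an ordinary, cheap attribute lookup.
        self.__class__ = types.ModuleType
        self.__dict__.update(module.__dict__)
        return getattr(module, attr)

def lazy_import(name):
    """Return a module-like object; the import is deferred until first use."""
    if name not in sys.modules:
        sys.modules[name] = _LazyModule(name)
    return sys.modules[name]

json = lazy_import("json")      # nothing actually imported yet
print(json.dumps({"a": 1}))     # the real import happens here
```

The `__class__` reassignment mirrors what the message describes ("changing the dummy object's type to something that doesn't have the `__getattribute__` any more"); the real Importing package also handles threading and reload corner cases that this sketch ignores.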
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Tue, Feb 7, 2012 at 6:40 PM, Terry Reedy wrote:

> importlib could provide a parameterized decorator for functions that are
> the only consumers of an import. It could operate much like this:
>
> def imps(mod):
>     def makewrap(f):
>         def wrapped(*args, **kwds):
>             print('first/only call to wrapper')
>             g = globals()
>             g[mod] = __import__(mod)
>             g[f.__name__] = f
>             f(*args, **kwds)
>         wrapped.__name__ = f.__name__
>         return wrapped
>     return makewrap
>
> @imps('itertools')
> def ic():
>     print(itertools.count)
>
> ic()
> ic()
> # first/only call to wrapper

If I were going to rewrite code, I'd just use lazy imports (see http://pypi.python.org/pypi/Importing ). They're even faster than this approach (or using plain import statements), as they have zero per-call function call overhead. It's just that not everything I write can depend on Importing. Throw an equivalent into the stdlib, though, and I guess I wouldn't have to worry about dependencies...

(To be clearer: I'm talking about the http://peak.telecommunity.com/DevCenter/Importing#lazy-imports feature, which sticks a dummy module subclass instance into sys.modules, whose __getattribute__ does a reload() of the module, forcing the normal import process to run, after first changing the dummy object's type to something that doesn't have the __getattribute__ any more. This ensures that all accesses after the first one are at normal module attribute access speed. That, and the "whenImported" decorator from Importing would probably be of general stdlib usefulness too.)
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Tue, Feb 7, 2012 at 5:24 PM, Brett Cannon wrote:

> On Tue, Feb 7, 2012 at 16:51, PJ Eby wrote:
>> On Tue, Feb 7, 2012 at 3:07 PM, Brett Cannon wrote:
>>> So, if there is going to be some baseline performance target I need to
>>> hit to make people happy I would prefer to know what that (real-world)
>>> benchmark is and what the performance target is going to be on a non-debug
>>> build. And if people are not worried about the performance then I'm happy
>>> with that as well. =)
>>
>> One thing I'm a bit worried about is repeated imports, especially ones
>> that are inside frequently-called functions. In today's versions of
>> Python, this is a performance win for "command-line tool platform" systems
>> like Mercurial and PEAK, where you want to delay importing as long as
>> possible, in case the code that needs the import is never called at all...
>> but, if it *is* used, you may still need to use it a lot of times.
>>
>> When writing that kind of code, I usually just unconditionally import
>> inside the function, because the C code check for an already-imported
>> module is faster than the Python "if" statement I'd have to clutter up my
>> otherwise-clean function with.
>>
>> So, in addition to the things other people have mentioned as performance
>> targets, I'd like to keep the slowdown factor low for this type of scenario
>> as well. Specifically, the slowdown shouldn't be so much as to motivate
>> lazy importers like Mercurial and PEAK to need to rewrite in-function
>> imports to do the already-imported check ourselves. ;-)
>>
>> (Disclaimer: I haven't actually seen Mercurial's delayed/dynamic import
>> code, so I can't say for 100% sure if they'd be affected the same way.)
>
> IOW you want the sys.modules case fast, which I will never be able to
> match compared to C code since that is pure execution with no I/O.

Couldn't you just prefix the __import__ function with something like this:

    ...
    try:
        module = sys.modules[name]
    except KeyError:
        # slow code path
        ...

(Admittedly, the import lock is still a problem; initially I thought you could just skip it for this case, but the problem is that another thread could be in the middle of executing the module.)
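Fleshed out slightly, the suggested fast path could wrap the builtin __import__ like the sketch below. All names here are invented for illustration, and — as the message concedes — the import lock and threading are ignored.

```python
import builtins
import sys

_slow_import = builtins.__import__  # keep the full machinery around

def fast_import(name, globals=None, locals=None, fromlist=(), level=0):
    # Fast path: a plain absolute "import x[.y]" whose module is already
    # loaded can be answered straight from sys.modules.  Note that
    # "import a.b" binds the *top-level* package, hence the partition().
    if level == 0 and not fromlist and name in sys.modules:
        return sys.modules[name.partition(".")[0]]
    # Anything else (relative imports, from-imports, first-time imports)
    # goes through the real, slower machinery.
    return _slow_import(name, globals, locals, fromlist, level)

builtins.__import__ = fast_import
import struct  # if struct was already loaded, this skipped the machinery
builtins.__import__ = _slow_import  # restore the original
```

This is only the repeated-import case; the first import of a module still pays full price, and a production version would also have to honour the import lock for partially initialized modules, which is exactly the problem flagged in the parenthetical above.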
[Python-Dev] Add a new "locale" codec?
Hi,

I added PyUnicode_DecodeLocale(), PyUnicode_DecodeLocaleAndSize() and PyUnicode_EncodeLocale() to Python 3.3 to fix bugs. I hesitate to expose this codec in Python: it can be useful in some cases, especially if you need to interact with C functions. The glib library has functions using the *current* locale encoding, g_locale_from_utf8() for example.

Related issue with more information: http://bugs.python.org/issue13619

Victor
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On 2/7/2012 4:51 PM, PJ Eby wrote:

> One thing I'm a bit worried about is repeated imports, especially ones
> that are inside frequently-called functions. In today's versions of
> Python, this is a performance win for "command-line tool platform"
> systems like Mercurial and PEAK, where you want to delay importing as
> long as possible, in case the code that needs the import is never called
> at all... but, if it *is* used, you may still need to use it a lot of
> times.
>
> When writing that kind of code, I usually just unconditionally import
> inside the function, because the C code check for an already-imported
> module is faster than the Python "if" statement I'd have to clutter up
> my otherwise-clean function with.

importlib could provide a parameterized decorator for functions that are the only consumers of an import. It could operate much like this:

    def imps(mod):
        def makewrap(f):
            def wrapped(*args, **kwds):
                print('first/only call to wrapper')
                g = globals()
                g[mod] = __import__(mod)
                g[f.__name__] = f
                f(*args, **kwds)
            wrapped.__name__ = f.__name__
            return wrapped
        return makewrap

    @imps('itertools')
    def ic():
        print(itertools.count)

    ic()
    ic()
    # first/only call to wrapper

--
Terry Jan Reedy
Re: [Python-Dev] requirements for moving __import__ over to importlib?
Brett Cannon writes: > IOW you want the sys.modules case fast, which I will never be able to match compared to C code since that is pure execution with no I/O. Sure you can: have a really fast Python VM. Constructive: if you can run this code under PyPy it'd be easy to just:

    $ pypy -mtimeit "import struct"
    $ pypy -mtimeit -s "import importlib" "importlib.import_module('struct')"

Or whatever the right API is. Alex
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Tue, 7 Feb 2012 17:16:18 -0500 Brett Cannon wrote: > > > > IOW I really do not look forward to someone saying "importlib is so much > > > slower at importing a module containing ``pass``" when (a) that never > > > happens, and (b) most programs do not spend their time importing but > > > instead doing interesting work. > > > > Well, import time is so important that the Mercurial developers have > > written an on-demand import mechanism, to reduce the latency of > > command-line operations. > > > > Sure, but they are a somewhat extreme case. I don't think Mercurial is extreme. Any command-line tool written in Python applies. For example, yum (Fedora's apt-get) is written in Python. And I'm sure many people do small administration scripts in Python. These tools may then be run in a loop by whatever other script. > > But it's not only important for Mercurial and the like. Even if you're > > developing a Web app, making imports slower will make restarts slower, > > and development more tedious in the first place. > > > > > Fine, startup cost from a hard crash I can buy when you are getting 1000 > QPS, but development more tedious? Well, waiting several seconds when reloading a development server is tedious. Anyway, my point was that other cases (than command-line tools) can be negatively impacted by import time. > > > So, if there is going to be some baseline performance target I need to > > hit > > > to make people happy I would prefer to know what that (real-world) > > > benchmark is and what the performance target is going to be on a > > non-debug > > > build. > > > > - No significant slowdown in startup time. > > > > What's significant and measuring what exactly? I mean startup already has a > ton of imports as it is, so this would wash out the point of measuring > practically anything else for anything small. I don't understand your sentence. 
Yes, startup has a ton of imports and that's why I'm fearing it may be negatively impacted :) ("a ton" being a bit less than 50 currently) > This is why I said I want a > benchmark to target which does actual work since flat-out startup time > measures nothing meaningful but busy work. "Actual work" can be very small in some cases. For example, if you run "hg branch" I'm quite sure it doesn't do a lot of work except importing many modules and then reading a single file in .hg (the one named ".hg/branch" probably, but I'm not a Mercurial dev). In the absence of more "real world" benchmarks, I think the startup benchmarks in the benchmarks repo are a good baseline. That said you could also install my 3.x port of Twisted here: https://bitbucket.org/pitrou/t3k/ and then run e.g. "python3 bin/trial -h". > I would get more out of code > that just stat'ed every file in Lib since at least that did some work. stat()ing files is not really representative of import work. There are many indirections in the import machinery. (actually, even import.c appears quite slower than a bunch of stat() calls would imply) > > - Within 25% of current performance when importing, say, the "struct" > > module (Lib/struct.py) from bytecode. > > Why struct? It's such a small module that it isn't really a typical module. Precisely to measure the overhead. Typical module size will vary depending on development style. Some people may prefer writing many small modules. Or they may be using many small libraries, or using libraries that have adopted such a development style. Measuring the overhead on small modules will make sure we aren't overly confident. > The median file size of Lib is 11K (e.g. tabnanny.py), not 238 bytes (which > is barely past Hello World). And is this just importing struct or is this > from startup, e.g. ``python -c "import struct"``? Just importing struct, as with the timeit snippets in the other thread. Regards Antoine. 
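The measurement style being discussed — a fresh import of a small module, bypassing the sys.modules cache — can be approximated in-process with a short timeit snippet; the module choice and iteration count here are arbitrary:

```python
import importlib
import sys
import timeit

def fresh_import():
    # Evict the cached module so each iteration exercises the full
    # import machinery rather than the sys.modules fast path.
    sys.modules.pop('struct', None)
    importlib.import_module('struct')

elapsed = timeit.timeit(fresh_import, number=200)
print('%.1f usec per fresh import' % (elapsed / 200 * 1e6))
```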
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Feb 07, 2012, at 09:19 PM, Paul Moore wrote: >One question here, I guess - does the importlib integration do >anything to make writing on-demand import mechanisms easier (I'd >suspect not, but you never know...) If it did, then performance issues >might be somewhat less of a sticking point, as usual depending on use >cases. It might even be a feature-win if a standard on-demand import mechanism could be added on top of importlib so all these projects wouldn't have to roll their own. -Barry
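Such a standard on-demand mechanism could be as small as a proxy object that defers the real import until first attribute access — a hand-rolled sketch for illustration, not any project's actual code:

```python
import importlib

class LazyModule:
    """Proxy that performs the real import on first attribute access."""
    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Only reached for attributes missing on the proxy itself.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)

json = LazyModule('json')   # no import work has happened yet
print(json.dumps([1, 2]))   # first attribute access triggers it: [1, 2]
```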
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Tue, 7 Feb 2012 17:24:21 -0500 Brett Cannon wrote: > > IOW you want the sys.modules case fast, which I will never be able to match > compared to C code since that is pure execution with no I/O. Why not continue using C code for that? It's trivial (just a dict lookup). Regards Antoine.
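The fast path in question amounts to something like this sketch (illustrative only, not import.c's actual logic):

```python
import sys

def cached_import(name):
    # Fast path for repeated imports: a single dict lookup.
    try:
        return sys.modules[name]
    except KeyError:
        # Slow path: defer to the full import machinery.
        return __import__(name)

import struct
print(cached_import('struct') is struct)  # True
```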
Re: [Python-Dev] which C language standard CPython must conform to
On Tue, Feb 7, 2012 at 1:41 PM, "Martin v. Löwis" wrote: > Am 07.02.2012 20:10, schrieb Gregory P. Smith: >> Why do we still care about C89? It is 2012 and we're talking about >> Python 3. What compiler on what platform that anyone actually cares >> about does not support C99? > > As Amaury says: Visual Studio still doesn't support C99. The story is > both funny and sad: In Visual Studio 2002, the release notes included > a comment that they couldn't consider C99 (in 2002), because of lack of > time, and the standard came so quickly. In 2003, they kept this notice. > In VS 2005 (IIRC), they said that there is too little customer demand > for C99 so that they didn't implement it; they recommended to use C++ > or C#, anyway. Now C2011 has been published. Thanks! I've probably asked this question before. Maybe I'll learn this time. ;) Some quick searching shows that there is at least hope Microsoft is on board with C++11 (not so surprising, their crown jewels are written in C++). We should at some point demand a C++ compiler for CPython and pick a subset of C++ features to allow use of, but that is likely reserved for the Python 4 timeframe (a topic for another thread and time entirely; it isn't feasible for today's codebase). In that timeframe another question may make sense to ask: Do we need a single unified all-platforms-from-one-codebase Python interpreter? If we can get other VM implementations up to date language-feature-wise and manage to sufficiently decouple standard library development from CPython itself, that becomes possible. One of the difficulties with that would obviously be new language feature development if it meant updating more than one VM at a time in order to ship an implementation of a new PEP. -gps
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Tue, Feb 7, 2012 at 16:51, PJ Eby wrote: > On Tue, Feb 7, 2012 at 3:07 PM, Brett Cannon wrote: > >> So, if there is going to be some baseline performance target I need to >> hit to make people happy I would prefer to know what that (real-world) >> benchmark is and what the performance target is going to be on a non-debug >> build. And if people are not worried about the performance then I'm happy >> with that as well. =) >> > > One thing I'm a bit worried about is repeated imports, especially ones > that are inside frequently-called functions. In today's versions of > Python, this is a performance win for "command-line tool platform" systems > like Mercurial and PEAK, where you want to delay importing as long as > possible, in case the code that needs the import is never called at all... > but, if it *is* used, you may still need to use it a lot of times. > > When writing that kind of code, I usually just unconditionally import > inside the function, because the C code check for an already-imported > module is faster than the Python "if" statement I'd have to clutter up my > otherwise-clean function with. > > So, in addition to the things other people have mentioned as performance > targets, I'd like to keep the slowdown factor low for this type of scenario > as well. Specifically, the slowdown shouldn't be so much as to motivate > lazy importers like Mercurial and PEAK to need to rewrite in-function > imports to do the already-imported check ourselves. ;-) > > (Disclaimer: I haven't actually seen Mercurial's delayed/dynamic import > code, so I can't say for 100% sure if they'd be affected the same way.) > IOW you want the sys.modules case fast, which I will never be able to match compared to C code since that is pure execution with no I/O.
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Tue, Feb 7, 2012 at 15:28, Dirkjan Ochtman wrote: > On Tue, Feb 7, 2012 at 21:24, Barry Warsaw wrote: > > Identifying the use cases are important here. For example, even if it > were a > > lot slower, Mailman wouldn't care (*I* might care because it takes > longer to > > run my test, but my users wouldn't). But Bazaar or Mercurial users > would care > > a lot. > > Yeah, startup performance getting worse kinda sucks for command-line > apps. And IIRC it's been getting worse over the past few releases... > > Anyway, I think there was enough of a python3 port for Mercurial (from > various GSoC students) that you can probably run some of the very > simple commands (like hg parents or hg id), which should be enough for > your purposes, right? > Possibly. Where is the code?
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Tue, Feb 7, 2012 at 16:19, Paul Moore wrote: > On 7 February 2012 20:49, Antoine Pitrou wrote: > > Well, import time is so important that the Mercurial developers have > > written an on-demand import mechanism, to reduce the latency of > > command-line operations. > > One question here, I guess - does the importlib integration do > anything to make writing on-demand import mechanisms easier (I'd > suspect not, but you never know...) If it did, then performance issues > might be somewhat less of a sticking point, as usual depending on use > cases. Depends on what your feature set is. I have a fully working mixin you can add to any loader which makes it lazy if you trigger the import on reading an attribute from the module: http://code.google.com/p/importers/source/browse/importers/lazy.py . But if you want to trigger the import on *writing* an attribute then I have yet to make that work in Python source (maybe people have an idea on how to make that work since __setattr__ doesn't mix well with __getattribute__).
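For reference, the read-triggered variant later landed in the stdlib as importlib.util.LazyLoader (added in Python 3.5); a sketch of its documented usage pattern, with 'tabnanny' picked arbitrarily as a rarely-imported module:

```python
import importlib.util
import sys

def lazy_import(name):
    """Return a module whose body only executes on first attribute read."""
    spec = importlib.util.find_spec(name)
    spec.loader = importlib.util.LazyLoader(spec.loader)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)  # schedules, but does not run, the body
    return module

mod = lazy_import('tabnanny')
print(mod.__doc__ is not None)  # attribute read triggers the real import
```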
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Tue, Feb 7, 2012 at 15:24, Barry Warsaw wrote: > Brett, thanks for persevering on importlib! Given how complicated imports > are > in Python, I really appreciate you pushing this forward. I've been knee > deep > in both import.c and importlib at various times. ;) > > On Feb 07, 2012, at 03:07 PM, Brett Cannon wrote: > > >One is maintainability. Antoine mentioned how if change occurs everyone is > >going to have to be able to fix code in importlib, and that's the point! > I > >don't know about the rest of you but I find Python code easier to work > with > >than C code (and if you don't you might be subscribed to the wrong mailing > >list =). I would assume the ability to make changes or to fix bugs will be > >a lot easier with importlib than import.c. So maintainability should be > >easier when it comes to imports. > > I think it's *really* critical that importlib be well-documented. Not just > its API, but also design documents (what classes are there, and why it's > decomposed that way), descriptions of how to extend and subclass, maybe > even > examples for doing some typical hooks. Maybe even a guided tour or > tutorial > for people digging into importlib for the first time. > That's fine and not difficult to do. > > >So, that is the positives. What are the negatives? Performance, of course. > > That's okay. Get it complete, right, and usable first and then unleash the > Pythonic hoards to bang on performance. > > >IOW I really do not look forward to someone saying "importlib is so much > >slower at importing a module containing ``pass``" when (a) that never > >happens, and (b) most programs do not spend their time importing but > >instead doing interesting work. > > Identifying the use cases are important here. For example, even if it > were a > lot slower, Mailman wouldn't care (*I* might care because it takes longer > to > run my test, but my users wouldn't). But Bazaar or Mercurial users would > care > a lot. 
Right, which is why I'm looking for some agreed-upon, concrete benchmark I can use which isn't fluff. -Brett
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Tue, Feb 7, 2012 at 15:49, Antoine Pitrou wrote: > On Tue, 7 Feb 2012 15:07:24 -0500 > Brett Cannon wrote: > > > > Now I'm going to be upfront and say I really did not want to have this > > performance conversation now as I have done *NO* profiling or analysis of > > the algorithms used in importlib in order to tune performance (e.g. the > > function that handles case-sensitivity, which is on the critical path for > > importing source code, has a platform check which could go away if I > > instead had platform-specific versions of the function that were assigned > > to a global variable at startup). > > >From a cursory look, I think you're gonna have to break (special-case) > some abstractions and have some inner loop coded in C for the common > cases. > Wouldn't shock me if it came to that, but obviously I would like to try to avoid it. > > That said, I think profiling and solving performance issues is critical > *before* integrating this work. It doesn't need to be done by you, but > the python-dev community shouldn't feel strong-armed to solve the issue. > > That part of the discussion I'm staying out of since I want to see this in so I'm biased. > > IOW I really do not look forward to someone saying "importlib is so much > > slower at importing a module containing ``pass``" when (a) that never > > happens, and (b) most programs do not spend their time importing but > > instead doing interesting work. > > Well, import time is so important that the Mercurial developers have > written an on-demand import mechanism, to reduce the latency of > command-line operations. > Sure, but they are a somewhat extreme case. > > But it's not only important for Mercurial and the like. Even if you're > developing a Web app, making imports slower will make restarts slower, > and development more tedious in the first place. > > Fine, startup cost from a hard crash I can buy when you are getting 1000 QPS, but development more tedious? 
> > So, if there is going to be some baseline performance target I need to > hit > > to make people happy I would prefer to know what that (real-world) > > benchmark is and what the performance target is going to be on a > non-debug > > build. > > - No significant slowdown in startup time. > What's significant and measuring what exactly? I mean startup already has a ton of imports as it is, so this would wash out the point of measuring practically anything else for anything small. This is why I said I want a benchmark to target which does actual work since flat-out startup time measures nothing meaningful but busy work. I would get more out of code that just stat'ed every file in Lib since at least that did some work. > > - Within 25% of current performance when importing, say, the "struct" > module (Lib/struct.py) from bytecode. > Why struct? It's such a small module that it isn't really a typical module. The median file size of Lib is 11K (e.g. tabnanny.py), not 238 bytes (which is barely past Hello World). And is this just importing struct or is this from startup, e.g. ``python -c "import struct"``?
Re: [Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings
> I'd rather restore support for allowing UTF-8 source here (I don't think > that requiring ASCII really improves much), than rename the macro. Done, I reverted my change. Victor
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Tue, Feb 7, 2012 at 3:07 PM, Brett Cannon wrote: > So, if there is going to be some baseline performance target I need to hit > to make people happy I would prefer to know what that (real-world) > benchmark is and what the performance target is going to be on a non-debug > build. And if people are not worried about the performance then I'm happy > with that as well. =) > One thing I'm a bit worried about is repeated imports, especially ones that are inside frequently-called functions. In today's versions of Python, this is a performance win for "command-line tool platform" systems like Mercurial and PEAK, where you want to delay importing as long as possible, in case the code that needs the import is never called at all... but, if it *is* used, you may still need to use it a lot of times. When writing that kind of code, I usually just unconditionally import inside the function, because the C code check for an already-imported module is faster than the Python "if" statement I'd have to clutter up my otherwise-clean function with. So, in addition to the things other people have mentioned as performance targets, I'd like to keep the slowdown factor low for this type of scenario as well. Specifically, the slowdown shouldn't be so much as to motivate lazy importers like Mercurial and PEAK to need to rewrite in-function imports to do the already-imported check ourselves. ;-) (Disclaimer: I haven't actually seen Mercurial's delayed/dynamic import code, so I can't say for 100% sure if they'd be affected the same way.)
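The two styles being contrasted look like this (the functions are hypothetical, for illustration): with an unconditional in-function import the already-imported check happens in C inside __import__, while the guarded version moves that check into Python-level clutter:

```python
import sys

def encode(data):
    # Unconditional import: after the first call this is just the
    # C-level sys.modules lookup inside __import__.
    import json
    return json.dumps(data)

def encode_guarded(data):
    # The Python-level "if" check the post argues against writing.
    json = sys.modules.get('json')
    if json is None:
        import json
    return json.dumps(data)

print(encode([1]) == encode_guarded([1]) == '[1]')  # True
```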
Re: [Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings
Am 07.02.2012 20:10, schrieb Gregory P. Smith: > Why do we still care about C89? It is 2012 and we're talking about > Python 3. What compiler on what platform that anyone actually cares > about does not support C99? As Amaury says: Visual Studio still doesn't support C99. The story is both funny and sad: In Visual Studio 2002, the release notes included a comment that they couldn't consider C99 (in 2002), because of lack of time, and the standard came so quickly. In 2003, they kept this notice. In VS 2005 (IIRC), they said that there is too little customer demand for C99 so that they didn't implement it; they recommended to use C++ or C#, anyway. Now C2011 has been published. Regards, Martin
Re: [Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings
> Does C99 specify the encoding? Can we expect UTF-8? No, it's implementation-defined. However, that really doesn't matter much for the macro (it does matter for the Mercurial repository): The files on disk are mapped, in an implementation-defined manner, into the source character set. All processing is done there, including any stringification. Then, for string literals, the source character set is converted into the execution character set. So for the definition of the _Py_identifier macro, it really matters what the run-time encoding of the stringified identifiers is. > Python is supposed to work on many platforms and so support a lot of > compilers, not only compilers supporting non-ASCII identifiers. And your point is? Regards, Martin
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On 7 February 2012 20:49, Antoine Pitrou wrote: > Well, import time is so important that the Mercurial developers have > written an on-demand import mechanism, to reduce the latency of > command-line operations. One question here, I guess - does the importlib integration do anything to make writing on-demand import mechanisms easier (I'd suspect not, but you never know...) If it did, then performance issues might be somewhat less of a sticking point, as usual depending on use cases. Paul.
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Tue, 7 Feb 2012 15:07:24 -0500 Brett Cannon wrote: > > Now I'm going to be upfront and say I really did not want to have this > performance conversation now as I have done *NO* profiling or analysis of > the algorithms used in importlib in order to tune performance (e.g. the > function that handles case-sensitivity, which is on the critical path for > importing source code, has a platform check which could go away if I > instead had platform-specific versions of the function that were assigned > to a global variable at startup). From a cursory look, I think you're gonna have to break (special-case) some abstractions and have some inner loop coded in C for the common cases. That said, I think profiling and solving performance issues is critical *before* integrating this work. It doesn't need to be done by you, but the python-dev community shouldn't feel strong-armed to solve the issue. > IOW I really do not look forward to someone saying "importlib is so much > slower at importing a module containing ``pass``" when (a) that never > happens, and (b) most programs do not spend their time importing but > instead doing interesting work. Well, import time is so important that the Mercurial developers have written an on-demand import mechanism, to reduce the latency of command-line operations. But it's not only important for Mercurial and the like. Even if you're developing a Web app, making imports slower will make restarts slower, and development more tedious in the first place. > So, if there is going to be some baseline performance target I need to hit > to make people happy I would prefer to know what that (real-world) > benchmark is and what the performance target is going to be on a non-debug > build. - No significant slowdown in startup time. - Within 25% of current performance when importing, say, the "struct" module (Lib/struct.py) from bytecode. Regards Antoine. 
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Tue, Feb 7, 2012 at 21:24, Barry Warsaw wrote: > Identifying the use cases are important here. For example, even if it were a > lot slower, Mailman wouldn't care (*I* might care because it takes longer to > run my test, but my users wouldn't). But Bazaar or Mercurial users would care > a lot. Yeah, startup performance getting worse kinda sucks for command-line apps. And IIRC it's been getting worse over the past few releases... Anyway, I think there was enough of a python3 port for Mercurial (from various GSoC students) that you can probably run some of the very simple commands (like hg parents or hg id), which should be enough for your purposes, right? Cheers, Dirkjan
Re: [Python-Dev] requirements for moving __import__ over to importlib?
Brett, thanks for persevering on importlib! Given how complicated imports are in Python, I really appreciate you pushing this forward. I've been knee deep in both import.c and importlib at various times. ;) On Feb 07, 2012, at 03:07 PM, Brett Cannon wrote: >One is maintainability. Antoine mentioned how if change occurs everyone is >going to have to be able to fix code in importlib, and that's the point! I >don't know about the rest of you but I find Python code easier to work with >than C code (and if you don't you might be subscribed to the wrong mailing >list =). I would assume the ability to make changes or to fix bugs will be >a lot easier with importlib than import.c. So maintainability should be >easier when it comes to imports. I think it's *really* critical that importlib be well-documented. Not just its API, but also design documents (what classes are there, and why it's decomposed that way), descriptions of how to extend and subclass, maybe even examples for doing some typical hooks. Maybe even a guided tour or tutorial for people digging into importlib for the first time. >So, that is the positives. What are the negatives? Performance, of course. That's okay. Get it complete, right, and usable first and then unleash the Pythonic hoards to bang on performance. >IOW I really do not look forward to someone saying "importlib is so much >slower at importing a module containing ``pass``" when (a) that never >happens, and (b) most programs do not spend their time importing but >instead doing interesting work. Identifying the use cases are important here. For example, even if it were a lot slower, Mailman wouldn't care (*I* might care because it takes longer to run my test, but my users wouldn't). But Bazaar or Mercurial users would care a lot. -Barry
[Python-Dev] requirements for moving __import__ over to importlib?
I'm going to start this off with the caveat that hg.python.org/sandbox/bcannon#bootstrap_importlib is not completely at feature parity, but getting there shouldn't be hard. There is a FAILING file that has a list of the tests that are not passing because of importlib bootstrapping, and a comment as to why (I think) they are failing. But no switch would ever happen until the test suite passes. Anyway, to start this conversation I'm going to open with why I think removing most of the C code in Python/import.c and replacing it with importlib/_bootstrap.py is a positive thing. One is maintainability. Antoine mentioned how if change occurs everyone is going to have to be able to fix code in importlib, and that's the point! I don't know about the rest of you but I find Python code easier to work with than C code (and if you don't you might be subscribed to the wrong mailing list =). I would assume the ability to make changes or to fix bugs will be a lot easier with importlib than import.c. So maintainability should be easier when it comes to imports. Two is APIs. PEP 302 introduced this idea of an API for objects that can perform imports so that people can control it, enhance it, introspect it, etc. But as it stands right now, import.c implements none of PEP 302 for any built-in import mechanism. This mostly stems from positive thing #1 I just mentioned. But since I was able to do this code from scratch I was able to design for (and extend) PEP 302 compliance in order to make sure the entire import system was exposed cleanly. This means it is much easier now to write a custom importer for quirky syntax, a different storage mechanism, etc. Third is multi-VM support. IronPython, Jython, and PyPy have all said they would love importlib to become the default import implementation so that all VMs have the same implementation. 
Some people have even said they will use importlib regardless of what CPython does simply to ease their coding burden, but obviously that still leads to the possibility of subtle semantic differences that would go away if all VMs used the same implementation. So switching would lead to one less possible semantic difference between the various VMs. So, that is the positives. What are the negatives? Performance, of course. Now I'm going to be upfront and say I really did not want to have this performance conversation now as I have done *NO* profiling or analysis of the algorithms used in importlib in order to tune performance (e.g. the function that handles case-sensitivity, which is on the critical path for importing source code, has a platform check which could go away if I instead had platform-specific versions of the function that were assigned to a global variable at startup). I also know that people have a bad habit of latching on to micro-benchmark numbers, especially for something like import which involves startup or can easily be measured. I mean I wrote importlib.test.benchmark to help measure performance changes in any algorithmic changes I might make, but it isn't a real-world benchmark like what Unladen Swallow gave us (e.g. the two start-up benchmarks that use real-world apps -- hg and bzr -- aren't available on Python 3 so only normal_startup and nosite_startup can be used ATM). IOW I really do not look forward to someone saying "importlib is so much slower at importing a module containing ``pass``" when (a) that never happens, and (b) most programs do not spend their time importing but instead doing interesting work. For instance, right now importlib does ``python -c "import decimal"`` (which, BTW, is the largest module in the stdlib) 25% slower on my machine with a pydebug build (a non-debug build would probably be in my favor as I have more Python objects being used in importlib and thus more sanity checks). 
But if you do something (very) slightly more interesting like ``python -m calendar``, where there is a slight amount of work, then importlib is currently only 16% slower. So it all depends on how we measure (as usual).

So, if there is going to be some baseline performance target I need to hit to make people happy, I would prefer to know what that (real-world) benchmark is and what the performance target is going to be on a non-debug build. And if people are not worried about the performance then I'm happy with that as well. =)

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings
2012/2/7 Gregory P. Smith
> Why do we still care about C89? It is 2012 and we're talking about
> Python 3. What compiler on what platform that anyone actually cares
> about does not support C99?

The Microsoft compilers on Windows do not support C99:
- Declarations must be at the start of a block
- No designated initializers for structures
- ASCII-only identifiers: http://msdn.microsoft.com/en-us/library/e7f8y25b.aspx

-- Amaury Forgeot d'Arc
Re: [Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings
Why do we still care about C89? It is 2012 and we're talking about Python 3. What compiler on what platform that anyone actually cares about does not support C99?

-gps
Re: [Python-Dev] importlib quest
On Mon, Feb 6, 2012 at 14:49, Antoine Pitrou wrote:
> On Mon, 6 Feb 2012 09:57:56 -0500 Brett Cannon wrote:
> > Thanks for any help people can provide me on this now 5 year quest
> > to get this work finished.
>
> Do you have any plan to solve the performance issue?

I have not even looked at performance or attempted to profile the code, so I suspect there is room for improvement.

> $ ./python -m timeit -s "import sys; mod='struct'" \
>     "__import__(mod); del sys.modules[mod]"
> 1 loops, best of 3: 75.3 usec per loop
> $ ./python -m timeit -s "import sys; mod='struct'; from importlib import __import__" \
>     "__import__(mod); del sys.modules[mod]"
> 1000 loops, best of 3: 421 usec per loop
>
> Startup time is already much worse in 3.3 than in 2.7. With such a
> slowdown in importing fresh modules, applications using many batteries
> (third-party or not) will be heavily impacted.

I have a benchmark suite for importing modules directly at importlib.test.benchmark, but it doesn't explicitly cover searching far down sys.path. I will see if any of the existing tests implicitly do that and, if not, add it.
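[The "searching far down sys.path" case Brett wants covered can be sketched as a micro-benchmark -- this is entirely hypothetical and not part of importlib.test.benchmark: prepend a pile of empty directories that every path finder must reject, then time a fresh import both ways.]

```python
import importlib
import shutil
import sys
import tempfile
import timeit

def fresh_import(mod="struct"):
    # Drop any cached copy so the finders have to search sys.path again.
    sys.modules.pop(mod, None)
    importlib.import_module(mod)

baseline = timeit.timeit(fresh_import, number=100)

# Prepend 50 empty directories that must each be rejected before the
# real location of the module is reached.
dummies = [tempfile.mkdtemp() for _ in range(50)]
sys.path[:0] = dummies
importlib.invalidate_caches()
long_path = timeit.timeit(fresh_import, number=100)

print("baseline: %.4fs  with long sys.path: %.4fs" % (baseline, long_path))

# Restore sys.path and clean up the scratch directories.
del sys.path[:len(dummies)]
for d in dummies:
    shutil.rmtree(d, ignore_errors=True)
```

The interesting number is the ratio between the two timings, which isolates the path-search cost from the cost of compiling and executing the module itself.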
[Python-Dev] [Python-ideas] matrix operations on dict :)
On Mon, Feb 6, 2012 at 6:12 PM, Steven D'Aprano wrote:
> On Mon, Feb 06, 2012 at 09:01:29PM +0100, julien tayon wrote:
> > Hello,
> >
> > Proposing vector operations on dict, and acknowledging there was a
> > homeomorphism from rooted n-ary trees to dict, was inducing the
> > possibility of making matrices of dicts / trees.
>
> This seems interesting to me, but I don't see that they are important
> enough to be built-in to dicts. [...]
>
> Otherwise, this looks rather like a library of functions looking for a
> use. It might help if you demonstrate what concrete problems this helps
> you solve.

I have the problem looking for this solution! The application for this functionality is in coding a fractal graph (or "multigraph" in the literature). This is the most powerful structure that Computer Science has ever conceived. If you look at the evolution of data structures in compsci, the fractal graph is the ultimate: from lists to trees to graphs to multigraphs, the later structures can always encompass the former with only O(1) extra cost. It has the potential to encode *any* relationship, from the very small to the very large (as well as across, or *laterally*), in one unified structure. Optimize this one data structure and the whole standard library could be refactored and simplified by an order of magnitude. Not only that, it will pave the way for the "re-factored" internet that's being worked on, which creates a content-centric Internet beyond the graph-level, hypertext internet. Believe, it will be awesome.

Slowing down,

mark
Re: [Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings
2012/2/7 "Martin v. Löwis":
>> _Py_IDENTIFIER(xxx) defines a variable called PyId_xxx, so xxx can
>> only be ASCII: the C language doesn't accept non-ASCII identifiers.
>
> That's not exactly true. In C89, source code is in the "source character
> set", which is implementation-defined, except that it must contain
> the "basic character set". I believe that it allows for
> implementation-defined characters in identifiers.

Hum, I hope that these C89 compilers use UTF-8.

> In C99, this is extended to include "universal character names" (\u
> escapes). They may appear in identifiers as long as the characters
> named are listed in annex D.59 (which I cannot locate).

Does C99 specify the encoding? Can we expect UTF-8?

Python is supposed to work on many platforms and so has to support a lot of compilers, not only compilers supporting non-ASCII identifiers.

Victor
Re: [Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings
> _Py_IDENTIFIER(xxx) defines a variable called PyId_xxx, so xxx can
> only be ASCII: the C language doesn't accept non-ASCII identifiers.

That's not exactly true. In C89, source code is in the "source character set", which is implementation-defined, except that it must contain the "basic character set". I believe that it allows for implementation-defined characters in identifiers.

In C99, this is extended to include "universal character names" (\u escapes). They may appear in identifiers as long as the characters named are listed in annex D.59 (which I cannot locate).

In C 2011, annexes D.1 and D.2 specify the characters that you can use in an identifier:

D.1 Ranges of characters allowed
1. 00A8, 00AA, 00AD, 00AF, 00B2−00B5, 00B7−00BA, 00BC−00BE, 00C0−00D6, 00D8−00F6, 00F8−00FF
2. 0100−167F, 1681−180D, 180F−1FFF
3. 200B−200D, 202A−202E, 203F−2040, 2054, 2060−206F
4. 2070−218F, 2460−24FF, 2776−2793, 2C00−2DFF, 2E80−2FFF
5. 3004−3007, 3021−302F, 3031−303F
6. 3040−D7FF
7. F900−FD3D, FD40−FDCF, FDF0−FE44, FE47−FFFD
8. 10000−1FFFD, 20000−2FFFD, 30000−3FFFD, 40000−4FFFD, 50000−5FFFD, 60000−6FFFD, 70000−7FFFD, 80000−8FFFD, 90000−9FFFD, A0000−AFFFD, B0000−BFFFD, C0000−CFFFD, D0000−DFFFD, E0000−EFFFD

D.2 Ranges of characters disallowed initially
1. 0300−036F, 1DC0−1DFF, 20D0−20FF, FE20−FE2F

Regards,
Martin
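[As a sketch -- not part of Martin's message -- the quoted ranges can be transcribed into a small Python checker. The function name is made up, and note that annex D.1 covers only the *extended* identifier characters; plain ASCII letters, digits, and '_' are allowed by the core identifier grammar, not by these ranges.]

```python
# Annex D.1 ranges, as quoted above, as (low, high) codepoint pairs.
D1_ALLOWED = [
    (0x00A8, 0x00A8), (0x00AA, 0x00AA), (0x00AD, 0x00AD), (0x00AF, 0x00AF),
    (0x00B2, 0x00B5), (0x00B7, 0x00BA), (0x00BC, 0x00BE), (0x00C0, 0x00D6),
    (0x00D8, 0x00F6), (0x00F8, 0x00FF),
    (0x0100, 0x167F), (0x1681, 0x180D), (0x180F, 0x1FFF),
    (0x200B, 0x200D), (0x202A, 0x202E), (0x203F, 0x2040),
    (0x2054, 0x2054), (0x2060, 0x206F),
    (0x2070, 0x218F), (0x2460, 0x24FF), (0x2776, 0x2793),
    (0x2C00, 0x2DFF), (0x2E80, 0x2FFF),
    (0x3004, 0x3007), (0x3021, 0x302F), (0x3031, 0x303F),
    (0x3040, 0xD7FF),
    (0xF900, 0xFD3D), (0xFD40, 0xFDCF), (0xFDF0, 0xFE44), (0xFE47, 0xFFFD),
    # Item 8: planes 1 through E, codepoints xx0000-xxFFFD.
] + [(plane << 16, (plane << 16) + 0xFFFD) for plane in range(1, 0xF)]

# Annex D.2: allowed, but not as the first character of an identifier.
D2_DISALLOWED_INITIALLY = [
    (0x0300, 0x036F), (0x1DC0, 0x1DFF), (0x20D0, 0x20FF), (0xFE20, 0xFE2F),
]

def extended_id_char_ok(ch, initial=False):
    """Is `ch` usable in a C11 identifier via the extended ranges?"""
    cp = ord(ch)
    if initial and any(lo <= cp <= hi for lo, hi in D2_DISALLOWED_INITIALLY):
        return False
    return any(lo <= cp <= hi for lo, hi in D1_ALLOWED)

# 'é' is allowed anywhere; a combining accent is allowed, but not first.
print(extended_id_char_ok("é"), extended_id_char_ok("\u0300", initial=True))
```

So under these ranges an identifier like ``PyId_café`` would be legal C11 (via universal character names), while one *starting* with a combining mark would not.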