Re: [Python-Dev] PEP 3333: wsgi_string() function
On Sun, Jan 9, 2011 at 1:47 AM, Stephen J. Turnbull step...@xemacs.org wrote:

> Robert Brewer writes:
> > Python 3.1 was released June 27th, 2009. We're coming up faster on the two-year period than we seem to be on a revised WSGI spec. Maybe we should shoot for a bytes of a known encoding type first.
>
> You have one. It's called ISO 2022: Information processing -- ISO 7-bit and 8-bit coded character sets -- Code extension techniques.

The popularity of that standard speaks for itself. The kind of object PJE was referring to is more like Ruby's strings, which do not embed the encoding inside the bytes themselves but have the encoding as a kind of annotation on the bytes, and do lazy transcoding when combining strings of different encodings.

The goal with respect to WSGI is that you could annotate bytes with an encoding but also change or fix that encoding if other out-of-band information implied that you got the encoding wrong (e.g., some data is submitted with the encoding of the page the browser was on, and so nothing inside the request itself will indicate the encoding of the data). Latin1 is kind of the poor man's version of this -- it's a good guess at an encoding that at worst requires transcoding that can be done in a predictable way. (Personally I think Latin1 gets us 99% of the way there, and so bytes-of-a-known-encoding are not really that important to the WSGI case.)

Ian

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
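The Ruby-style annotated bytes described above can be sketched in a few lines. This is purely illustrative: the class name, `as_text`, and `reinterpret` are invented here, not anything proposed on the list.

```python
class EncodedBytes(bytes):
    # Hypothetical sketch: bytes with the encoding carried as an
    # annotation on the object, not embedded in the bytes themselves,
    # so the annotation can be corrected later when out-of-band
    # information shows the first guess was wrong.
    def __new__(cls, data, encoding='latin-1'):
        self = super().__new__(cls, data)
        self.encoding = encoding
        return self

    def as_text(self):
        return self.decode(self.encoding)

    def reinterpret(self, encoding):
        # The bytes stay identical; only the annotation changes.
        return EncodedBytes(bytes(self), encoding)


# Form data arrives with no in-band encoding information, so we guess...
raw = EncodedBytes(b'R\xc3\xa9union', 'latin-1')
# ...and later learn from the page's charset that it was really UTF-8:
fixed = raw.reinterpret('utf-8')
```

The Latin-1 guess is recoverable in exactly the way the message describes: transcoding is predictable, so fixing the annotation never loses data.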
Re: [Python-Dev] Continuing 2.x
On Fri, Oct 29, 2010 at 12:21 PM, Barry Warsaw ba...@python.org wrote:

> On Oct 29, 2010, at 12:43 PM, Casey Duncan wrote:
> > I like Python 3, I am using it for my latest projects, but I am also keeping Python 2 compatibility. This incurs some overhead, and basically means I am still really only using Python 2 features. So in some respects, my Python 3.x support is only tacit, it works as well as for Python 2, but it's not taking advantage of Python 3 really. I haven't run into a situation yet where I really want to or have to use Python 3 exclusive features, but then again I'm not really learning to use Python 3 either, short of the new C API.
>
> One thing that *might* be interesting to explore for Python 3.3 would be something like `python3 --1` or some such switch that would help Python 2 code run more easily in Python 3. This might be a hook to 2to3 or other internal changes that help some of the trickier bits of writing cross-compatible code.

More useful IMHO would be things like "from __past__ import print_statement": still requiring some annotation of code to make it run, but less invasive than translating the code itself. There are still major things you can't handle like that, but if something is syntactically acceptable in both Python 2 and 3, then it's a lot easier to apply simple conditionals around semantics. This would remove the need, for example, for people to use sys.exc_info() to avoid using "except Exception as e".

-- Ian Bicking | http://blog.ianbicking.org
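For reference, the sys.exc_info() idiom mentioned at the end looks like this in cross-compatible code (a minimal sketch; the helper function is invented for illustration):

```python
import sys

def safe_int(value):
    try:
        return int(value)
    except ValueError:
        # Cross-version spelling: "except ValueError, e" is a syntax
        # error on Python 3, and "except ValueError as e" is a syntax
        # error on Python 2.5 and earlier, so code that must parse on
        # both reaches for sys.exc_info() instead.
        e = sys.exc_info()[1]
        return 'bad value: %s' % e

# Function-style print parses on Python 2 and 3 alike.
print(safe_int('7'))
print(safe_int('seven'))
```

The point of the "from __past__" idea is that code like this could keep one syntax and flip only the semantics it needs.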
Re: [Python-Dev] Continuing 2.x
On Thu, Oct 28, 2010 at 9:04 AM, Barry Warsaw ba...@python.org wrote:

> Who is the target audience for a Python 2.8? What exactly would a Python 2.8 accomplish? If Python 2.8 doesn't include new features, well, then what's the point? Python 2.7 will be bug fix maintained for a long time, longer in fact than previous Python 2 versions. So a no-feature Python 2.8 can't be about improving Python 2 stability over time (i.e. just fix the bug in Python 2.7). If Python 2.8 is about adding new features, then it has to be about backporting those features from Python 3. Adding new features only to a Python 2.8 *isn't* Python, it's a fork of Python.

Thinking about language features and core types this seems reasonable, but with the standard library it seems less so -- there are lots of conservative changes to the standard library which aren't bug fixes, and the more the standard library is out of sync between Python 2 and 3, the harder maintaining software that works across those versions becomes. One opportunity, though, is to distribute modules from the standard library under new names (e.g., unittest2); at least in Python 2 you then don't have to do anything fancy or worry about whether the standard library has caught up with the forked module. Library installers seem particularly apropos to this discussion, as everyone seems excited to get them into the standard library and distributed with Python, but with the current plan that will never happen for Python 2.

-- Ian Bicking | http://blog.ianbicking.org
Re: [Python-Dev] Supporting raw bytes data in urllib.parse.* (was Re: Polymorphic best practices)
On Mon, Sep 20, 2010 at 6:19 PM, Nick Coghlan ncogh...@gmail.com wrote:

> > What are the cases you believe will cause new mojibake?
>
> Calling operations like urlsplit on byte sequences in non-ASCII compatible encodings and operations like urljoin on byte sequences that are encoded with different encodings. These errors differ from the URL escaping errors you cite, since they can produce true mojibake (i.e. a byte sequence without a single consistent encoding), rather than merely non-compliant URLs. However, if someone has let their encodings get that badly out of whack in URL manipulation they're probably doomed anyway...

FWIW, while I understand the problems non-ASCII-compatible encodings can create, I've never encountered them, perhaps because ASCII-compatible encodings are so dominant. There are ways you can get a URL (HTTP specifically) where there is no notion of Unicode. I think the use case everyone has in mind here is where you get a URL from one of these sources, and you want to handle it. I have a hard time imagining the sequence of events that would lead to mojibake. Naive parsing of a document in bytes couldn't do it, because if you have a non-ASCII-compatible document your ASCII-based parsing will also fail (e.g., looking for b'href=(.*?)'). I suppose if you did urlparse.urlsplit(user_input.encode(sys.getdefaultencoding())) you could end up with the problem.

All this is unrelated to the question, though -- a separate byte-oriented function won't help any case I can think of. If the programmer is implementing something like urlparse.urlsplit(user_input.encode(sys.getdefaultencoding())), it's because they *want* to get bytes out. So if it's named urlparse.urlsplit_bytes() they'll just use that, with the same corruption. Since bytes and text don't interact well, the choice of bytes in and bytes out will be a deliberate one.
*Or*, bytes will unintentionally come through, but that will just delay the error a while when the bytes out don't work (e.g., urlparse.urljoin(text_url, urlparse.urlsplit(byte_url).path)). Delaying the error is a little annoying, but a delayed error doesn't lead to mojibake. Mojibake is caused by allowing bytes and text to intermix, and the polymorphic functions as proposed don't add new dangers in that regard.

-- Ian Bicking | http://blog.ianbicking.org
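This is in fact how the polymorphic functions behave in urllib.parse as the bytes support eventually shipped (Python 3.2): str in gives str out, bytes in gives bytes out, and mixing the two is the delayed-but-loud error described above.

```python
from urllib.parse import urlsplit, urljoin

# str in -> str out
assert urlsplit('http://example.com/foo?bar=baz').path == '/foo'

# bytes in -> bytes out: a deliberate choice by the caller
assert urlsplit(b'http://example.com/foo?bar=baz').path == b'/foo'

# Mixing the two fails with TypeError rather than quietly
# producing mojibake.
try:
    urljoin('http://example.com/', urlsplit(b'/foo').path)
except TypeError:
    print('mixing str and bytes is refused')
```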
Re: [Python-Dev] [Web-SIG] Backup plan: WSGI 1 Addenda and wsgiref update for Py3
On Tue, Sep 21, 2010 at 12:47 PM, Chris McDonough chr...@plope.com wrote:

> On Tue, 2010-09-21 at 12:09 -0400, P.J. Eby wrote:
> > While the Web-SIG is trying to hash out PEP 444, I thought it would be a good idea to have a backup plan that would allow the Python 3 stdlib to move forward, without needing a major new spec to settle out implementation questions.
>
> If a WSGI-1-compatible protocol seems more sensible to folks, I'm personally happy to defer discussion on PEP 444 or any other backwards-incompatible proposal.

I think both make sense; making WSGI 1 sensible for Python 3 (as well as other small errata like the size hint) doesn't detract from PEP 444 at all, IMHO.

-- Ian Bicking | http://blog.ianbicking.org
Re: [Python-Dev] [Web-SIG] Backup plan: WSGI 1 Addenda and wsgiref update for Py3
On Tue, Sep 21, 2010 at 12:09 PM, P.J. Eby p...@telecommunity.com wrote:

> The Python 3 specific changes are to use:
>
> * ``bytes`` for I/O streams in both directions
> * ``str`` for environ keys and values
> * ``bytes`` for arguments to start_response() and write()

This is the only thing that seems odd to me -- it seems like the response should be symmetric with the request, and the request in this case uses str for headers (status being header-like) and bytes for the body. Otherwise this seems good to me; the only other major errata I can think of are all listed in the links you included.

> * text stream for wsgi.errors
>
> In other words, strings in, bytes out for headers, bytes for bodies. In general, only changes that don't break Python 2 WSGI implementations are allowed. The changes should also not break mod_wsgi on Python 3, but may make some Python 3 WSGI applications non-compliant, despite continuing to function on mod_wsgi. This is because mod_wsgi allows applications to output string headers and bodies, but I am ruling that option out because it forces every piece of middleware to be tested with arbitrary combinations of strings and bytes in order to test compliance. If you want your application to output strings rather than bytes, you can always use a decorator to do that. (And a sample one could be provided in wsgiref.)

I agree allowing both is not ideal.

-- Ian Bicking | http://blog.ianbicking.org
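The decorator alluded to could be quite small. This is only a sketch (wsgiref ships nothing by this name, and it ignores details like write() and close()): it lets the application body yield str while the wrapped app emits the bytes the rules above require.

```python
def encode_output(app, encoding='utf-8'):
    # Hypothetical decorator: encode any str body chunks the wrapped
    # application yields, passing bytes chunks through untouched.
    def wrapper(environ, start_response):
        for chunk in app(environ, start_response):
            yield chunk.encode(encoding) if isinstance(chunk, str) else chunk
    return wrapper

@encode_output
def hello_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain; charset=utf-8')])
    yield 'Hello, world'   # str here, bytes on the wire

body = b''.join(hello_app({}, lambda status, headers: None))
```

Middleware below the decorator then only ever sees bytes bodies, which is exactly the testability argument being made.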
Re: [Python-Dev] Polymorphic best practices [was: (Not) delaying the 3.2 release]
On Fri, Sep 17, 2010 at 3:25 PM, Michael Foord fuzzy...@voidspace.org.uk wrote:

> On 16/09/2010 23:05, Antoine Pitrou wrote:
> > On Thu, 16 Sep 2010 16:51:58 -0400, R. David Murray rdmur...@bitdance.com wrote:
> > > What do we store in the model? We could say that the model is always text. But then we lose information about the original bytes message, and we can't reproduce it. For various reasons (mailman being a big one), this is not acceptable. So we could say that the model is always bytes. But we want access to (for example) the header values as text, so header lookup should take string keys and return string values[2].
> >
> > Why can't you have both in a single class? If you create the class using a bytes source (a raw message sent by SMTP, for example), the class automatically parses and decodes it to unicode strings; if you create the class using a unicode source (the text body of the e-mail message and the list of recipients, for example), the class automatically creates the bytes representation.
>
> I think something like this would be great for WSGI. Rather than focus on whether bytes *or* text should be used, use a higher level object that provides a bytes view, and (where possible/appropriate) a unicode view too.

This is what WebOb does; e.g., there is only a bytes version of a POST body, and a view on that body that does decoding and encoding. If you don't touch something, it is never decoded or encoded.
I only vaguely understand the specifics here, and I suspect the specifics matter, but this seems applicable in this case too -- if you have an incoming email with a smattering of bytes, inline (RFC 2047) encoding, other encoding declarations, and then orthogonal systems like quoted-printable, you don't want to touch that stuff if you don't need to, as handling unicode objects implies you are normalizing the content, and that might have subtle impacts you don't know about, or don't want to know about, or that maybe just don't fit into the unicode model (like a string with two character sets).

Note that WebOb does not have two views; it has only one view -- unicode viewing bytes. I'm not sure I could keep two views straight. I *think* Antoine is describing two possible canonical data types (unicode or bytes) and two views. That sounds hard.

-- Ian Bicking | http://blog.ianbicking.org
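The one-view design being described can be reduced to a few lines. This is a rough sketch in the spirit of WebOb, not its actual API: the canonical data is bytes, and text is a view decoded only on demand, so untouched content is never normalized.

```python
class Body:
    # Illustrative class: one canonical representation (bytes) and one
    # lazy view (text). Nothing is decoded until someone asks.
    def __init__(self, raw, charset='utf-8'):
        self.raw = raw            # bytes, left exactly as received
        self.charset = charset

    @property
    def text(self):
        # Decoding happens here, and only here.
        return self.raw.decode(self.charset)

body = Body(b'R\xc3\xa9union')
```

Code that only shuttles `body.raw` around never pays for (or risks) a decode; code that wants text asks for `body.text` explicitly.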
Re: [Python-Dev] PEP 376 proposed changes for basic plugins support
Just to add a general opinion in here: having worked with Setuptools' entry points, and a little with some Zope pluginish systems (Products.*, which I don't think anyone liked much, and some ways ZCML is used are pluginish), I'm not very excited about these.

The plugin system that causes the least confusion and yet seems to accomplish everything it needs to is just listing objects in configuration -- nothing gets activated implicitly with installation, and names are Python package/object names without indirection. The only thing I'd want to add is the ability to also point to files, as a common use for plugins is adding ad hoc functionality to an application, and the overhead of package creation isn't always called for. hg, for example, seems both simple and general enough, and it doesn't use anything fancy.

Purely for the purpose of discovery and documentation it might be helpful to have APIs; then some tool could show available plugins (especially if PyPI had a query interface for this), or at least installed plugins, with the necessary code to invoke them. *Maybe* it would make sense to generalize the discovery of plugin types, so that you can simply refer to an object and the application can determine what kind of plugin it is. But having described this, it actually doesn't seem like a useful thing to generalize.

-- Ian Bicking | http://blog.ianbicking.org
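The "just list objects in configuration" approach needs very little machinery. A sketch, with the function name and the "package.module:attribute" spelling chosen here for illustration:

```python
import importlib

def load_object(dotted_name):
    # Resolve a "package.module:attribute" string to the object it
    # names -- no registration step, no entry-point indirection.
    module_name, _, attr = dotted_name.partition(':')
    module = importlib.import_module(module_name)
    return getattr(module, attr) if attr else module

# A config file might simply say:  renderer = json:dumps
render = load_object('json:dumps')
```

An application reads such names out of its own config and resolves them at startup; nothing is activated merely by being installed.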
Re: [Python-Dev] Thoughts fresh after EuroPython
On Mon, Jul 26, 2010 at 9:06 AM, Barry Warsaw ba...@python.org wrote:

> On Jul 24, 2010, at 07:08 AM, Guido van Rossum wrote:
> > privileges enough. So, my recommendation (which surely is a turn-around of my *own* attitude in the past) is to give out more commit privileges sooner.
>
> +1, though I'll observe that IME, actual commit privileges become much less of a special badge once a dvcs-based workflow is put in place. In the absence of that, I agree that we have enough checks and balances in place to allow more folks to commit changes

Even with a DVCS in place, commit privileges allow the person who cares about a change to move it forward, including the more mechanical aspects. E.g., if there are positive reviews of a person's changes in their fork, they can push those changes in. Or more generally, there are a lot of ways of getting approval, but limited commit privileges mean all approval must ultimately be funneled through someone with commit access. Also, different parts of the codebase should have different levels of review and conservatism; e.g., adding clarifications to the docs requires a different level of review than changing stuff in the core. We could try to build that into the tools, but it's a lot easier to make the tools permissive and build these distinctions into social structures.

-- Ian Bicking | http://blog.ianbicking.org
Re: [Python-Dev] Python Language Summit EuroPython 2010
On Wed, Jul 21, 2010 at 10:11 AM, Tim Golden m...@timgolden.me.uk wrote:

> A discussion on the Cheeseshop / Package Index highlighted the fact that the packaging infrastructure has become increasingly important, especially since setuptools, buildout and pip all download from it. Richard produced graphs showing the increase in package downloads over time, and attributed the recent slight tail-off to the fact that the toolchains are now becoming more canny with respect to caching and mirroring. Martin Richard confirmed that mirrors are now in place, and Marc Andre confirmed that he would be putting together a proposal to have PyPI hosted in the cloud. Guido pointed out that if an AppEngine implementation were desirable, he was sure that the AppEngine team would support it with resources as needed. Martin didn't feel that there was a problem with loading on the box in question; it's the uptime that's behind people's concern, as it's now so essential to installing and deploying Python applications.

From what I've been able to tell from afar, I strongly suspect PyPI's downtimes would be greatly reduced with a move to mod_wsgi (currently it is using mod_fcgi, and most downtime is solved with an Apache restart -- mod_wsgi generally recovers from these problems without intervention). Martin attempted this at one time but ran into some installation problems. It seems like the team of people managing PyPI could benefit from the addition of someone with more of a sysadmin background (e.g., to help with installing a monitor on the server).

-- Ian Bicking | http://blog.ianbicking.org
Re: [Python-Dev] Removing IDLE from the standard library
On Sun, Jul 11, 2010 at 3:38 PM, Ron Adam r...@ronadam.com wrote:

> There might be another alternative. Both idle and pydoc are applications (are there others?) that are in the standard library. As such, they, or parts of them, are possibly importable to other projects. That restricts changes, because a committer needs to consider the chances that a change may break something else. I suggest they be moved out of the lib directory, but still be included with python. (Possibly in the tools directory.) That removes some of the backward compatibility restrictions, or at least makes it clear there isn't a need for backward compatibility.

I also like this idea. This means Python comes with an IDE out of the box, but without the overhead of a management and release process that is built for something very different than a GUI program (the standard library). This would mean that IDLE would be in site-packages, could easily be upgraded using normal tools, and maybe most importantly it could have its own community tools and a development process that is more casual (and can more easily integrate new contributors) with a higher velocity of changes and releases. Python releases would then ship the most recent stable release of IDLE.

-- Ian Bicking | http://blog.ianbicking.org
Re: [Python-Dev] bytes / unicode
On Fri, Jun 25, 2010 at 2:05 AM, Stephen J. Turnbull step...@xemacs.org wrote:

> But join('x', 'y') -> 'x/y' and join(b'x', b'y') -> b'x/y' make sense to me. So, actually, I *don't* understand what you mean by needing LBYL. Consider docutils. Some folks assert that URIs *are* bytes and should be manipulated as such. So base URIs should be bytes.

I don't get what you are arguing against. Are you worried that if we make URL code polymorphic, some code will treat URLs as bytes, and that code will be incompatible with URLs as text? No one is arguing we remove text support from any of these functions, only that we allow bytes.

-- Ian Bicking | http://blog.ianbicking.org
Re: [Python-Dev] thoughts on the bytes/string discussion
On Fri, Jun 25, 2010 at 5:06 AM, Stephen J. Turnbull step...@xemacs.org wrote:

> > So with this idea in mind it makes more sense to me that *specific pieces of text* can be reasonably treated as both bytes and text. All the string literals in urllib.parse.urlunsplit() for example. The semantics I imagine are that special('/') + b'x' == b'/x' (i.e., it does not become special('/x')) and special('/') + 'x' == '/x' (again, it becomes str). This avoids some of the cases of unicode or str infecting a system as they did in Python 2 (where you might pass in unicode and everything works fine until some non-ASCII is introduced).
>
> I think you need to give explicit examples where this actually helps in terms of type contagion. I expect that it doesn't help at all, especially not for the people whose native language for URIs is bytes. These specials are still going to flip to unicode as soon as it comes in, and that will be incompatible with the bytes they'll need later. So they're still going to need to filter out unicode on input. It looks like it would be useful for programmers of polymorphic functions, though.

I'm proposing these specials would be used in polymorphic functions, like the functions in urllib.parse. I would not personally use them in my own code (unless of course I was writing my own polymorphic functions). This also makes it less important that the objects be a full stand-in for text, as their use should be isolated to specific functions; they aren't objects that should be passed around much. So you can easily identify and quickly detect if you use unsupported operations on those text-like objects. (This is all a very different use case from bytes+encoding, I think.)

-- Ian Bicking | http://blog.ianbicking.org
Re: [Python-Dev] thoughts on the bytes/string discussion
On Fri, Jun 25, 2010 at 11:30 AM, Stephen J. Turnbull step...@xemacs.org wrote:

> Ian Bicking writes:
> > I'm proposing these specials would be used in polymorphic functions, like the functions in urllib.parse. I would not personally use them in my own code (unless of course I was writing my own polymorphic functions). This also makes it less important that the objects be a full stand-in for text, as their use should be isolated to specific functions, they aren't objects that should be passed around much. So you can easily identify and quickly detect if you use unsupported operations on those text-like objects.
>
> OK. That sounds reasonable to me, but I don't see any need for a builtin type for it. Inclusion in the stdlib is not quite a no-brainer, but given Guido's endorsement of polymorphism, I can't bring myself to go lower than +0.9 wink.

Agreed on a builtin; I think it would be fine to put something in the strings module, and then in these examples code that used '/' would instead use strings.ascii('/') (not so sure of what the name should be, though).

-- Ian Bicking | http://blog.ianbicking.org
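For concreteness, the semantics described in this thread (special('/') + b'x' == b'/x', special('/') + 'x' == '/x', never promoting the result back to the special type) can be prototyped as a small str subclass. The strings.ascii('/') spelling is from the message above; the class itself is purely an illustrative sketch.

```python
class ascii_str(str):
    # Hypothetical "special" text: a str that cooperates with bytes by
    # encoding itself as ASCII, and whose results are always plain
    # bytes or plain str -- the special type does not spread.
    def __add__(self, other):
        if isinstance(other, bytes):
            return self.encode('ascii') + other
        return str(self) + other

    def __radd__(self, other):
        if isinstance(other, bytes):
            return other + self.encode('ascii')
        return other + str(self)

sep = ascii_str('/')
```

A polymorphic urlunsplit could then use ascii_str for all of its literals: feed it bytes and you get bytes out, feed it str and you get str out, and a genuine bytes/str mix still fails with TypeError.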
Re: [Python-Dev] thoughts on the bytes/string discussion
On Thu, Jun 24, 2010 at 12:38 PM, Bill Janssen jans...@parc.com wrote:

> Here are a couple of ideas I'm taking away from the bytes/string discussion. First, it would probably be a good idea to have a String ABC. Secondly, maybe the string situation in 2.x wasn't as broken as we thought it was. In particular, those who deal with lots of encoded strings seemed to find it handy, and miss it in 3.x. Perhaps strings are more like numbers than we think. We have separate types for int, float, Decimal, etc. But they're all numbers, and they all cross-operate. In 2.x, it seems there were two missing features: no encoding attribute on str, which should have been there and should have been required, and the default encoding being ASCII (I can't tell you how many times I've had to fix that issue when a non-ASCII encoded str was passed to some output function).

I've started to form a conceptual notion that I think fits these cases. We've set up a system where we think of text as natively unicode, with encodings to put that unicode into a byte form. This is certainly appropriate in a lot of cases. But there's a significant class of problems where bytes are the native structure. Network protocols are what we've been discussing, and are a notable case of that. That is, b'/' is the most native sense of a path separator in a URL, and b':' is the most native sense of what separates a header name from a header value in HTTP. To disallow unicode URLs or unicode HTTP headers would be rather anti-social, especially because unicode is now the native string type in Python 3. (As an aside, for the WSGI spec we've been talking about using native strings in some positions like dictionary keys, meaning Python 2 str and Python 3 str, while being more exacting in other areas such as a response body, which would always be bytes.)
The HTTP spec and other network protocols seem a little fuzzy on this, because they were written before unicode even existed, and even later activity happened at a point when unicode and text weren't widely considered the same thing like they are now. But I think the original intention is revealed in a more modern specification like WebSockets, where they are very explicit that ':' is just shorthand for a particular byte; it is not text in our new modern notion of the term.

So with this idea in mind it makes more sense to me that *specific pieces of text* can be reasonably treated as both bytes and text. All the string literals in urllib.parse.urlunsplit(), for example. The semantics I imagine are that special('/') + b'x' == b'/x' (i.e., it does not become special('/x')) and special('/') + 'x' == '/x' (again, it becomes str). This avoids some of the cases of unicode or str infecting a system as they did in Python 2 (where you might pass in unicode and everything works fine until some non-ASCII is introduced).

The one place where this might be tricky is if you have an encoding that is not ASCII compatible. But we can't guard against every possibility. So it would be entirely wrong to take a string encoded with UTF-16 and start to use b'/' with it. But there are other nonsensical combinations already possible, especially with polymorphic functions; we can't guard against all of them. Also I'm unsure if something like UTF-16 is in any way compatible with the kind of legacy systems that use bytes. Can you encode your filesystem with UTF-16? I don't think you could encode a cookie with it.

> So maybe having a second string type in 3.x that consists of an encoded sequence of bytes plus the encoding, call it estr, wouldn't have been a bad idea. It would probably have made sense to have estr cooperate with the str type, in the same way that two different kinds of numbers cooperate, promoting the result of an operation only when necessary.
> This would automatically achieve the kind of polymorphic functionality that Guido is suggesting, but without losing the ability to do x = e(ASCII)bar a = ''.join(foo, x) (or whatever the syntax for such an encoded string literal would be -- I'm not claiming this is a good one) which I presume would bind a to a Unicode string foobar -- have to work out what gets promoted to what.

I would be entirely happy without a literal syntax. But as Phillip has noted, this can't be implemented *entirely* in a library, as there are some constraints with the current str/bytes implementations. Reading PEP 3003 I'm not clear if such changes are part of the moratorium? They seem like they would be (sadly), but it doesn't seem clearly noted.

I think there's a *different* use case for things like bytes-in-a-utf8-encoding (e.g., to allow XML data to be decoded lazily), but that could be yet another class, and maybe shouldn't be polymorphically usable as bytes (i.e., treat it as an optimized str representation that is otherwise semantically equivalent). A String ABC would formalize these things.

-- Ian Bicking | http://blog.ianbicking.org
Re: [Python-Dev] thoughts on the bytes/string discussion
On Thu, Jun 24, 2010 at 3:59 PM, Guido van Rossum gu...@python.org wrote:

> The protocol specs typically go out of their way to specify what byte values they use for syntactically significant positions (e.g. ':' in headers, or '/' in URLs), while hand-waving about the meaning of what goes in between, since it is all typically treated as not of syntactic significance. So you can write a parser that looks at bytes exclusively, and looks for a bunch of ASCII punctuation characters (e.g. '<', '>', '/', '&'), and doesn't know or care whether the stuff in between is encoded in Latin-15, MacRoman or UTF-8 -- it never looks inside stretches of characters between the special characters and just copies them. (Sometimes there may be *some* sections that are required to be ASCII and there the equivalence of a-z and A-Z is well defined.)

Yes, these are the specific characters that I think we can handle specially. For instance, the list of all string literals used by urlsplit and urlunsplit:

* '//'
* '/'
* ':'
* '?'
* '#'
* '' (the empty string)
* 'http'
* a list of all valid scheme characters (a-z etc.)
* some lists for scheme-specific parsing (which all contain valid scheme characters)

All of these are constrained to ASCII, and must be constrained to ASCII, and everything else in a URL is treated as basically opaque. So if we turned these characters into byte-or-str objects I think we'd basically be true to the intent of the specs, and in a practical sense we'd be able to make these functions polymorphic. I suspect this same pattern will be present most places where people want polymorphic behavior. For now we could do something incomplete and just avoid using operators we can't overload (is it possible to at least make them produce a readable exception?). I think we'll avoid a lot of the confusion that was present with Python 2 by not making the coercions transitive.
For instance, here's something that would work in Python 2:

    urlunsplit(('http', 'example.com', '/foo', u'bar=baz', ''))

And you'd get out a unicode string, except that would break the first time that query string (u'bar=baz') was not ASCII (but not until then!). Here's the urlunsplit code:

    def urlunsplit(components):
        scheme, netloc, url, query, fragment = components
        if netloc or (scheme and scheme in uses_netloc and url[:2] != '//'):
            if url and url[:1] != '/':
                url = '/' + url
            url = '//' + (netloc or '') + url
        if scheme:
            url = scheme + ':' + url
        if query:
            url = url + '?' + query
        if fragment:
            url = url + '#' + fragment
        return url

If all those literals were this new special kind of string, and you called:

    urlunsplit((b'http', b'example.com', b'/foo', 'bar=baz', b''))

you'd end up constructing the URL b'http://example.com/foo' and then running:

    url = url + special('?') + query

And that would fail, because b'http://example.com/foo' + special('?') would be b'http://example.com/foo?' and you cannot add that to the str 'bar=baz'. So we'd be avoiding the Python 2 craziness.

-- Ian Bicking | http://blog.ianbicking.org
Re: [Python-Dev] bytes / unicode
On Wed, Jun 23, 2010 at 10:30 AM, Tres Seaver tsea...@palladion.com wrote: Stephen J. Turnbull wrote: We do need str-based implementations of modules like urllib. Why would that be? URLs aren't text, and never will be. The fact that to the eye they may seem to be text-ish doesn't make them text. This *is* a case where "don't make me think" is a losing proposition: programmers who work with URLs in any non-opaque way as text are eventually going to be bitten by this issue no matter how hard we wave our hands.

HTML is text, and URLs are embedded in that text, so it's easy to get a URL that is text. Though, with a little testing, I notice that text alone can't tell you what the right URL really is (at least the intended URL when unsafe characters are embedded in HTML). To test I created two pages, one in Latin-1 and another in UTF-8, and put in the link: ./test.html?param=Réunion. On the Latin-1 page it created a link to test.html?param=R%E9union and on the UTF-8 page it created a link to test.html?param=R%C3%A9union (the second link displays in the URL bar as test.html?param=Réunion but copies with percent encoding). Though if you link to ./Réunion.html then both pages create UTF-8 links. And both pages also link http://Réunion.com to http://xn--runion-bva.com/.

So really neither bytes nor text works completely; query strings receive the encoding of the page, which would be handled transparently if you worked on the page's bytes. Path and domain are consistently encoded with UTF-8 and punycode respectively and so would be handled best when treated as text. And of course if you are a page with a non-ASCII-compatible encoding you really must handle encodings before the URL is sensible. Another issue here is that there's no encoding for turning a URL into bytes if the URL is not already ASCII.
A proper way to encode a URL would be: (Totally as an aside, as I remind myself of new module names I notice it's not easy to google specifically for Python 3 docs, e.g. "python 3 urlsplit" gives me 2.6 docs)

    from urllib.parse import urlsplit, urlunsplit
    import encodings.idna

    def encode_http_url(url, page_encoding='ASCII', errors='strict'):
        scheme, netloc, path, query, fragment = urlsplit(url)
        scheme = scheme.encode('ASCII', errors)
        auth = port = None
        if '@' in netloc:
            auth, netloc = netloc.split('@', 1)
        if ':' in netloc:
            netloc, port = netloc.split(':', 1)
        netloc = encodings.idna.ToASCII(netloc)
        if port:
            netloc = netloc + b':' + port.encode('ASCII', errors)
        if auth:
            netloc = auth.encode('UTF-8', errors) + b'@' + netloc
        path = path.encode('UTF-8', errors)
        query = query.encode(page_encoding, errors)
        fragment = fragment.encode('UTF-8', errors)
        return urlunsplit_bytes((scheme, netloc, path, query, fragment))

Where urlunsplit_bytes handles bytes (urlunsplit does not). It's helpful for me at least to look at that code specifically:

    def urlunsplit(components):
        scheme, netloc, url, query, fragment = components
        if netloc or (scheme and scheme in uses_netloc and url[:2] != '//'):
            if url and url[:1] != '/':
                url = '/' + url
            url = '//' + (netloc or '') + url
        if scheme:
            url = scheme + ':' + url
        if query:
            url = url + '?' + query
        if fragment:
            url = url + '#' + fragment
        return url

In this case it really would be best to have Python 2's system where things are coerced to ASCII implicitly. Or, more specifically, if all those string literals in that routine could be implicitly converted to bytes using ASCII. Conceptually I think this is reasonable, as for URLs (at least with HTTP, but in practice I think this applies to all URLs) the ASCII bytes really do have meaning. That is, '/' (*in the context of urlunsplit*) really is \x2f specifically. Or another example, making a GET request really means sending the bytes \x47\x45\x54 and there is no other set of bytes that has that meaning.
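The urlunsplit_bytes helper named above isn't in the standard library; a simplified sketch (it drops the stdlib's uses_netloc special-casing and only emits '//' when a netloc is actually present) might look like:

```python
def urlunsplit_bytes(components):
    # bytes twin of urllib.parse.urlunsplit; every literal here is the ASCII
    # byte the URL syntax actually means (':' is \x3a, '/' is \x2f, ...)
    scheme, netloc, url, query, fragment = components
    if netloc:
        if url and url[:1] != b'/':
            url = b'/' + url
        url = b'//' + netloc + url
    if scheme:
        url = scheme + b':' + url
    if query:
        url = url + b'?' + query
    if fragment:
        url = url + b'#' + fragment
    return url
```

For example, urlunsplit_bytes((b'http', b'example.com', b'/foo', b'bar=baz', b'')) gives b'http://example.com/foo?bar=baz'.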
The WebSockets specification for instance defines things like colon: http://tools.ietf.org/html/draft-hixie-thewebsocketprotocol-76#page-5 -- in an earlier version they even used bytes to describe HTTP ( http://tools.ietf.org/html/draft-hixie-thewebsocketprotocol-54#page-13), though this annoyed many people.

-- Ian Bicking | http://blog.ianbicking.org
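The page-encoding effect observed in the experiment above can be reproduced directly: urllib.parse.quote takes an encoding argument in Python 3, and the domain behavior goes through the idna codec:

```python
from urllib.parse import quote

# the same query value, percent-encoded as a Latin-1 page vs. a UTF-8 page would
assert quote('Réunion', encoding='latin-1') == 'R%E9union'
assert quote('Réunion', encoding='utf-8') == 'R%C3%A9union'

# domains, by contrast, are consistently encoded with IDNA/punycode
assert 'Réunion.com'.encode('idna') == b'xn--runion-bva.com'
```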
Re: [Python-Dev] bytes / unicode
Oops, I forgot some important quoting (important for the algorithm, maybe not actually for the discussion)...

    from urllib.parse import urlsplit, urlunsplit
    import encodings.idna

    # urllib.parse.quote always returns str, and is not as
    # conservative in quoting as required here...
    def quote_unsafe_bytes(b):
        result = []
        for c in b:
            if c < 0x20 or c >= 0x80:
                result.extend(('%%%02X' % c).encode('ASCII'))
            else:
                result.append(c)
        return bytes(result)

    def encode_http_url(url, page_encoding='ASCII', errors='strict'):
        scheme, netloc, path, query, fragment = urlsplit(url)
        scheme = scheme.encode('ASCII', errors)
        auth = port = None
        if '@' in netloc:
            auth, netloc = netloc.split('@', 1)
        if ':' in netloc:
            netloc, port = netloc.split(':', 1)
        netloc = encodings.idna.ToASCII(netloc)
        if port:
            netloc = netloc + b':' + port.encode('ASCII', errors)
        if auth:
            netloc = quote_unsafe_bytes(auth.encode('UTF-8', errors)) + b'@' + netloc
        path = quote_unsafe_bytes(path.encode('UTF-8', errors))
        query = quote_unsafe_bytes(query.encode(page_encoding, errors))
        fragment = quote_unsafe_bytes(fragment.encode('UTF-8', errors))
        return urlunsplit_bytes((scheme, netloc, path, query, fragment))

-- Ian Bicking | http://blog.ianbicking.org
Re: [Python-Dev] bytes / unicode
On Tue, Jun 22, 2010 at 6:31 AM, Stephen J. Turnbull step...@xemacs.orgwrote: Toshio Kuratomi writes: I'll definitely buy that. Would urljoin(b_base, b_subdir) = bytes and urljoin(u_base, u_subdir) = unicode be acceptable though? Probably. But it doesn't matter what I say, since Guido has defined that as polymorphism and approved it in principle. (I think, given other options, I'd rather see two separate functions, though. Yes. If you want to deal with things like this:: http://host/café http://host/caf%C3%A9 Yes. Just for perspective, I don't know if I've ever wanted to deal with a URL like that. I know how it is supposed to work, and I know what a browser does with that, but so many tools will clean that URL up *or* won't be able to deal with it at all that it's not something I'll be passing around. So from a practical point of view this really doesn't come up, and if it did it would be in a situation where you could easily do something ad hoc (though there is not currently a routine to quote unsafe characters in a URL... that would be helpful, though maybe urllib.quote(url.encode('utf8'), '%/:') would do it). Also while it is problematic to treat the URL-unquoted value as text (because it has an unknown encoding, no encoding, or regularly a mixture of encodings), the URL-quoted value is pretty easy to pass around, and normalization (in this case to http://host/caf%C3%A9) is generally fine. While it's nice to be correct about encodings, sometimes it is impractical. And it is far nicer to avoid the situation entirely. That is, decoding content you don't care about isn't just inefficient, it's complicated and can introduce errors. The encoding of the underlying bytes of a %-decoded URL is largely uninteresting. Browsers (whose behavior drives a lot of convention) don't touch any of that encoding except lately occasionally to *display* some data in a more friendly way. But it's only display, and errors just make it revert to the old encoded display. 
Similarly I'd expect (from experience) that a programmer using Python would want to take the same approach, sticking with unencoded data in nearly all situations.

-- Ian Bicking | http://blog.ianbicking.org
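The ad-hoc normalization suggested above (a rough sketch, not a complete URL quoter; the safe set is my guess at a reasonable one) can be written with urllib.parse.quote, keeping '%' in the safe set so existing escapes aren't double-quoted:

```python
from urllib.parse import quote

def normalize_url(url):
    # quote raw unsafe characters (UTF-8, per convention for paths), but
    # leave existing %-escapes and URL delimiters alone
    return quote(url, safe="%/:?#&=@")

assert normalize_url('http://host/café') == 'http://host/caf%C3%A9'
# already-normalized input passes through unchanged
assert normalize_url('http://host/caf%C3%A9') == 'http://host/caf%C3%A9'
```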
Re: [Python-Dev] bytes / unicode
On Tue, Jun 22, 2010 at 1:07 PM, James Y Knight f...@fuhm.net wrote: The surrogateescape method is a nice workaround for this, but I can't help thinking that it might've been better to just treat stuff as possibly-invalid-but-probably-utf8 byte-strings from input, through processing, to output. It seems kinda too late for that, though: next time someone designs a language, they can try that. :) surrogateescape does help a lot, my only problem with it is that it's out-of-band information. That is, if you have data that went through data.decode('utf8', 'surrogateescape') you can restore it to bytes or transcode it to another encoding, but you have to know that it was decoded specifically that way. And of course if you did have to transcode it (e.g., text.encode('utf8', 'surrogateescape').decode('latin1')) then if you had actually handled the text in any way you may have broken it; you don't *really* have valid text. A lazier solution feels like it would be easier and more transparent to work with. But... I also don't see any major language constraint to having another kind of string that is bytes+encoding. I think PJE brought up a problem with a couple coercion aspects. -- Ian Bicking | http://blog.ianbicking.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
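The round-trip, and the out-of-band problem being described, are easy to see in a few lines:

```python
raw = b'caf\xe9'                      # Latin-1 bytes, not valid UTF-8
text = raw.decode('utf-8', 'surrogateescape')
assert text == 'caf\udce9'            # the bad byte survives as a lone surrogate

# you can only get back to the real data if you *know* it was decoded this way:
assert text.encode('utf-8', 'surrogateescape') == raw
# ...and can transcode once the true encoding is discovered:
assert text.encode('utf-8', 'surrogateescape').decode('latin-1') == 'café'
```

Nothing about `text` itself records that it went through surrogateescape; that knowledge travels out-of-band, which is the complaint above.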
Re: [Python-Dev] bytes / unicode
On Tue, Jun 22, 2010 at 11:17 AM, Guido van Rossum gu...@python.org wrote: (2) Data sources. These can be functions that produce new data from non-string data, e.g. str(int), read it from a named file, etc. An example is read() vs. write(): it's easy to create a (hypothetical) polymorphic stream object that accepts both f.write('booh') and f.write(b'booh'); but you need some other hack to make read() return something that matches a desired return type. I don't have a generic suggestion for a solution; for streams in particular, the existing distinction between binary and text streams works, of course, but there are other situations where this doesn't generalize (I think some XML interfaces have this awkwardness in their API for converting a tree to a string). This reminds me of the optimization ElementTree and lxml made in Python 2 (not sure what they do in Python 3?) where they use str when a string is ASCII to avoid the memory and performance overhead of unicode. Also at least lxml is also dealing with the divide between the internal libxml2 string representation and the Python representation. This is a place where bytes+encoding might also have some benefit. XML is someplace where you might load a bunch of data but only touch a little bit of it, and the amount of data is frequently large enough that the efficiencies are important. -- Ian Bicking | http://blog.ianbicking.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Python 2.7b1 and argparse's version action
On Sun, Apr 18, 2010 at 6:24 PM, Steven Bethard steven.beth...@gmail.comwrote: On Sun, Apr 18, 2010 at 3:52 PM, Antoine Pitrou solip...@pitrou.net wrote: Steven Bethard steven.bethard at gmail.com writes: Note that even though I agree with you that -v/--version is probably not the best choice, in the poll[2] 11% of people still wanted this. This strikes me as a small minority. Agreed, but it's also the current behavior, and has been since the beginning of argparse. Note that no one complained about it until Tobias filed the issue in Nov 06, 2009. I encountered this problem within minutes of first using argparse. Of course I'm very familiar with optparse and the standard optparse instantiation flies off my fingers without thinking. But then there's going to be a lot more people with that background using argparse once it is in the standard library -- people who don't really care about argparse or optparse but just want to use the standard thing. I don't see any reason why argparse can't simply do exactly what optparse did. There's nothing wrong with it. It's what many people expect. We should just defer to tradition when the choice isn't important (it's getting to be a very bike shed thread). Somewhat relatedly, what is the plan for past and future argparse releases? Michael Foord for instance is releasing unittest improvements in parallel under the name unittest2. I believe there is strong disfavor with releasing packages that overlap with the standard library, so continuing to release argparse under the name argparse will cause problems. I would hate to see release complications or confusions keep argparse from seeing future development. -- Ian Bicking | http://blog.ianbicking.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
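For reference, matching optparse's traditional behavior in argparse takes one line: the 'version' action prints the version string and exits, just as optparse's auto-added --version did.

```python
import argparse

parser = argparse.ArgumentParser(prog='tool')
# the 'version' action reproduces optparse's --version: print and exit(0)
parser.add_argument('--version', action='version', version='%(prog)s 1.0')
```

Running `tool --version` then prints "tool 1.0" and exits with status 0.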
Re: [Python-Dev] Proposing PEP 376
On Wed, Apr 7, 2010 at 9:40 AM, Tarek Ziadé ziade.ta...@gmail.com wrote: so for the PEP:
- sys.prefix - the installation prefix provided by --prefix at installation time
- site-packages - the installation libdir, provided by --install-lib at installation time

How do you actually calculate site-packages? Would you store the directory name somewhere? Would you import the module and look at os.path.dirname(os.path.dirname(module.__file__))? Or just scan to see where the module would be? If you store the directory name somewhere then you have another absolute path. This is why, for simplicity, I thought it should be relative to the directory where the record file is (lots of extraneous ../, but the most obvious meaning of a relative filename).

-- Ian Bicking | http://blog.ianbicking.org
Re: [Python-Dev] Proposing PEP 376
On Wed, Apr 7, 2010 at 12:45 PM, P.J. Eby p...@telecommunity.com wrote: Examples under debian: docutils/__init__.py - located in /usr/local/lib/python2.6/site-packages/ ../../../bin/rst2html.py - located in /usr/local/bin /etc/whatever - located in /etc I'm wondering if there's really any benefit to having ../../../bin/rst2html.py vs. /usr/local/bin/rst2html.py. Was there a use case for that, or should we just go with relative paths ONLY for children of the libdir? (I only suggested this setup in order to preserve as much of the prefix-relativity proposal as possible, but I wasn't the one who proposed prefix-relativity so I don't recall what the use case is, and I don't even remember who proposed it. I only ever had a use case for libdir-relativity personally.)

Yes, in a virtualenv environment there will be a ../../../bin/rst2html.py that will still be under the (virtual) sys.prefix, and the whole bundle can be usefully moved around.

-- Ian Bicking | http://blog.ianbicking.org
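The libdir-relative scheme being discussed is just os.path.relpath against the directory holding the record file; with the Debian-style paths from the example above (on POSIX):

```python
import os

site_packages = '/usr/local/lib/python2.6/site-packages'
script = '/usr/local/bin/rst2html.py'

# lots of '../', but unambiguous, and the whole prefix can be relocated
# (e.g. into a virtualenv) without rewriting any paths
assert os.path.relpath(script, site_packages) == '../../../bin/rst2html.py'
```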
Re: [Python-Dev] Bootstrap script for package management tool in Python 2.7 (Was: Re: [Distutils] At least one package management tool for 2.7)
On Mon, Mar 29, 2010 at 11:26 AM, Larry Hastings la...@hastings.org wrote: anatoly techtonik wrote: So, there won't be any package management tool shipped with Python 2.7 and users will have to download and install `setuptools` manually as before: search - download - unzip - cmd - cd - python setup.py install Therefore I still propose shipping bootstrap package that instruct user how to download and install an actual package management tool when users tries to use it. For what it's worth, Guido prototyped something similar in March of 2008, but his was an actual bootstrapping tool for package management: http://mail.python.org/pipermail/python-dev/2008-March/077837.html His tool knew how to download a tar file, untar it, and run python setup.py install on it. No version numbers, no dependency management, simple enough that it should be easy to get right. Only appropriate for bootstrapping into a real package management tool. The thread ends with him saying I don't have time to deal with this further this week, and I dunno, maybe it just fell off the radar? I'd been thinking about resurrecting the discussion but I didn't have time either. I would consider this bootstrap to be quite workable, though I would add that any extra option to the bootstrap script should be passed to setup.py install, and the download should be cached (so you can do -h and not have to re-download the package once you figure out the extra options -- at least a --user option is reasonable here for people without root). Specifically targeting this bootstrap for tools like pip and virtualenv is no problem. I think looking around PyPI etc is kind of more than I'd bother with. Those things change, this bootstrap code won't change, it could cause unnecessary future pain. 
Maybe (*maybe*) it could look in http://pypi.python.org/well-known-packages/PACKAGE_NAME and so we can have it install a certain small number of things quickly that way -- if the URL it looks to is targeted only for the bootstrap script itself then we don't have to worry about compatibility problems as much. Oh... then I can think of a half dozen other options it could take, and then it becomes an installer. Blech. OK, I'd be willing to cut off the options at --user (which I think is a minimum... maybe --prefix too), and maybe some simple package detection so people could write python -m bootstrap Setuptools --user -- entirely based on some well-known URL baked into bootstrap.py, where the URL is independent of any other service (and so is least likely to cause future problems or ambiguities). An advantage to this kind of bootstrapper is that as future packaging systems are developed there's a clear way to get started with them, without prematurely baking anything in to Python.

-- Ian Bicking | http://blog.ianbicking.org | http://twitter.com/ianbicking
Re: [Python-Dev] __file__
The one issue I thought would be resolved by not easily allowing .pyc-only distributions is the case when you rename a file (say module.py to newmodule.py) and there is a module.pyc laying around, and you don't get the ImportError you would expect from import module -- and to make it worse everything basically works, except there's two versions of the module that slowly become different. This regularly causes problems for me, and those problems would get more common and obscure if the pyc files were stashed away in a more invisible location. I can't even tell what the current proposal is; maybe this is resolved? If distributing bytecode required renaming pyc files to .py as Glenn suggested that would resolve the problem quite nicely from my perspective. (Frankly I find the whole use case for distributing bytecodes a bit specious, but whatever.)

-- Ian Bicking | http://blog.ianbicking.org | http://twitter.com/ianbicking
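The failure mode described here is easy to reproduce with a legacy-style (non-__pycache__) module.pyc left behind after a rename -- a sketch, using a throwaway directory:

```python
import os
import py_compile
import shutil
import sys
import tempfile

d = tempfile.mkdtemp()
try:
    src = os.path.join(d, 'module.py')
    with open(src, 'w') as f:
        f.write('VALUE = 1\n')
    # compile to a legacy-location module.pyc, then rename the source away,
    # as if the module had been renamed without cleaning up
    py_compile.compile(src, cfile=os.path.join(d, 'module.pyc'))
    os.rename(src, os.path.join(d, 'newmodule.py'))
    sys.path.insert(0, d)
    import module            # no ImportError: the stale .pyc satisfies it
    imported_value = module.VALUE
finally:
    sys.path.remove(d)
    shutil.rmtree(d)

assert imported_value == 1   # the "renamed" module quietly lives on
```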
Re: [Python-Dev] Proposal for virtualenv functionality in Python
be possible to move these environments around without breaking things. That would be compelling. I'm one of those folks who'd like to see this be stackable. If we tweak the semantics just a bit I think it works:

* pythonv should inspect its --prefix arguments, as well as passing them on to the child python process it runs. With a config file I'd just expect a list of prefixes being allowed; directly nesting feels unnecessarily awkward. You could use a : (or Windows-semicolon) list just like with PYTHONPATH.
* When pythonv wants to run the next python process in line, it scans the path looking for the pythonX.X interpreter but /ignores/ all the interpreters that are in a --prefix bin directory it's already seen.
* python handles multiple --prefix options, and later ones take precedence over earlier ones.
* What should sys.interpreter be? Explicit is better than implicit: the first pythonv to run also adds a --interpreter argv[0] to the front of the command-line. Or they could all add it and python only uses the last one. This is one area where python vs python3.2 makes things a little complicated.

Ah, yes, the same problem I note above. It should definitely be the thing the person actually typed, or what is in the #! line.

I'm at PyCon and would be interested in debating / sprinting on this if there's interest. Yeah, if you see me around, please catch me!

-- Ian Bicking | http://blog.ianbicking.org | http://twitter.com/ianbicking
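The interpreter-chaining rule sketched above (hypothetical pythonv semantics, not existing behavior) amounts to a PATH scan that skips the bin directories of prefixes already consumed by an earlier pythonv in the chain:

```python
import os

def find_next_python(path_dirs, seen_prefixes, name='python3.2'):
    # hypothetical pythonv logic: pick the next interpreter on the path,
    # ignoring the bin/ dirs of --prefix environments already in the chain
    seen_bins = {os.path.join(p, 'bin') for p in seen_prefixes}
    for d in path_dirs:
        if d in seen_bins:
            continue
        candidate = os.path.join(d, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

Each pythonv in the chain would add its own prefix to the "seen" set before re-scanning, so nested environments resolve to successively earlier interpreters.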
Re: [Python-Dev] Proposal for virtualenv functionality in Python
On Fri, Feb 19, 2010 at 10:39 PM, Glenn Linderman v+pyt...@g.nevcal.com wrote: On approximately 2/19/2010 1:18 PM, came the following characters from the keyboard of P.J. Eby: At 01:49 PM 2/19/2010 -0500, Ian Bicking wrote: I'm not sure how this should best work on Windows (without symlinks, and where things generally work differently), but I would hope if this idea is more visible that someone more opinionated than I would propose the appropriate analog on Windows. You'd probably have to just copy pythonv.exe to an appropriate directory, and have it use the configuration file to find the real prefix. At least, that'd be a relatively obvious way to do it, and it would have the advantage of being symmetrical across platforms: just copy or symlink pythonv, and make sure the real prefix is in your config file. (Windows does have shortcuts but I don't think that there's any way for a linked program to know *which* shortcut it was launched from.) No automatic way, but shortcuts can include parameters, not just the program name. So a parameter could be --prefix as was suggested in another response, but for a different reason. Windows also has hard-links for files.

A lot of Windows tools are completely ignorant of both of those linking concepts... resulting in disks that look to be over capacity when they are not, for example. Virtualenv uses copies when it can't use symlinks. A copy (or hard link) seems appropriate on systems that do not have symlinks. It would seem reasonable that on Windows it might look in the registry to find the actual location where Python was installed. Or... whatever technique Windows people think is best; it's simply necessary that the interpreter know its location (the isolated environment) and also know where Python is installed. All this needs to be calculated in C, as the standard library needs to be on the path very early (so os.symlink wouldn't help, but any C-level function to determine this would be helpful).
(It's maybe a bit lame of me that I'm dropping this in the middle of PyCon, as I'm not online frequently during the conference; sorry about that)

-- Ian Bicking | http://blog.ianbicking.org | http://twitter.com/ianbicking
[Python-Dev] Proposal for virtualenv functionality in Python
need to be aware of this to compile extensions properly (we can be somewhat aware of these cases by looking at places where virtualenv already has problems compiling extensions). Some people have argued for something like sys.prefixes, a list of locations you might look at, which would allow a kind of nesting of these environments (where sys.prefixes[-1] == sys.prefix; or maybe reversed). Personally this seems like it would be hard to keep mental track of this, but I can understand the purpose -- you could for instance create a kind of template prefix that has *most* of what you want installed in it, then create sub-environments that contain for instance an actual application, or a checkout (to test just one new piece of code). I'm not sure how this should best work on Windows (without symlinks, and where things generally work differently), but I would hope if this idea is more visible that someone more opinionated than I would propose the appropriate analog on Windows. -- Ian Bicking | http://blog.ianbicking.org | http://twitter.com/ianbicking ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Improved Traceback Module
On Thu, Jan 28, 2010 at 11:01 AM, s...@pobox.com wrote:

pje> If you look for a local variable in each frame containing a format
pje> string, let's say __trace__, you could apply that format string to
pje> a locals+globals dictionary for the frame, in place of dumping all
pje> the locals by default

I commented on the blog post before noticing all the replies here. I'll embellish that suggestion by suggesting that instance attributes can be as valuable when debugging instance methods. Perhaps __trace_self__ (or similar) could be fed from self.__dict__ if it exists?

It seems reasonable to special case the variable named self. You might also want other hooks. For instance in weberror we take a convention from Zope of looking for __traceback_supplement__, which is a factory for an object that informs the traceback (a factory so you don't have to actually instantiate it until there's an error). I then extended its protocol a bit, and use it for putting request information into the traceback.

I can imagine two lighter ways to do this. One is something like:

    __traceback_inspect__ = ['self', 'request']

which indicates those two local variables should be inspected. Another might be some magic method on the request object. Of course if repr(request) is sufficient then you are golden. But it almost certainly isn't sufficient. There's usually key objects that deserve special attention in the case of an error, but which you don't want to flood the output just because you happen to print their repr. (With WebOb actually str(request) would be quite good, while repr(request) would be too brief.)

To echo Guido, in my own traceback extensions I have at least a couple levels of try:except: around anything fancy. repr() definitely fails. Unicode errors happen at a lot of different levels (repr() returning unicode, for example).
And everything you do may break simply by an error in the code, and you still shouldn't lose at least the old traceback, so putting one big try:except:traceback.print_exc() around your code is also appropriate. Well... not quite appropriate because that would show the exception in the traceback machinery. Instead you should save exc_info and show both tracebacks. Given the amount of data involved you also don't want the traceback to become too hard to read for simple bugs. What is really useful for an unattended process that occasionally fails with unexpected input, may be excessive for development; either it has to be easy to switch on and off, or there needs to be some compromise. In HTML it's easy to make a compromise (put in a little Javascript to hide the extended detail until asked for, for instance). Of course, in some contexts (an email, a web page) the stuff at the top is most visible, and a great place for an abbreviated view, while in other contexts (mostly at a console) the bottom is easiest. Oh, and you even should consider: will you get a unicode error on output? I'd actually suggest returning a unicode subclass that won't ever emit UnicodeEncodeError when it is converted to a str or bytes. Correctness at that stage is not as important as not losing the exception. So... a few suggestions. -- Ian Bicking | http://blog.ianbicking.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
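A condensed sketch of the layering described above (my own illustration, not weberror's actual code): every repr() is guarded individually, and one outer guard falls back to the stock traceback of the *original* exception:

```python
import traceback

def render_fancy(exc_info):
    # stand-in for a fancy formatter: dump locals for each frame,
    # guarding every repr() so one broken object can't sink the report
    lines = []
    tb = exc_info[2]
    while tb is not None:
        for name, value in list(tb.tb_frame.f_locals.items()):
            try:
                lines.append('  %s = %r' % (name, value))
            except Exception:
                lines.append('  %s = <unrepresentable>' % name)
        tb = tb.tb_next
    return '\n'.join(lines)

def format_exception_safely(exc_info):
    # a bug in the fancy formatter must not lose the original traceback;
    # fall back to the plain rendering of the exception we started with
    try:
        return render_fancy(exc_info)
    except Exception:
        return ''.join(traceback.format_exception(*exc_info))
```

A fuller version would save sys.exc_info() before attempting the fancy rendering and show both tracebacks if the formatter itself fails, as the text above suggests.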
Re: [Python-Dev] Executing zipfiles and directories (was Re: PyCon Keynote)
On Tue, Jan 26, 2010 at 1:40 PM, Paul Moore p.f.mo...@gmail.com wrote: 2010/1/26 Nick Coghlan ncogh...@gmail.com: Glenn Linderman wrote: That would seem to go a long ways toward making the facility user friendly, at least on Windows, which is where your complaint about icons was based, and the only change to Python would be to recognize that if a .py contains a .zip signature, That should work today - the zipfile/directory support shouldn't care about the filename at all (although the test suite doesn't currently cover any extensions other than .zip, so I could be wrong about that...). You're right, it works:

    >type __main__.py
    print "Hello from a zip file"
    >zip mz.py __main__.py
      adding: __main__.py (172 bytes security) (stored 0%)
    >mz.py
    Hello from a zip file

Sadly you can't then do:

    chmod +x mz.py
    ./mz.py

because it doesn't have #!/usr/bin/env python like typical executable Python scripts have. You can put the shebang line at the beginning of the zip file, and zip will complain about it but will still unpack the file, but it won't be runnable as Python won't recognize it as a zip anymore. Now if you could, say, put in #!/usr/bin/env pythonz (and then implement a pythonz command that could do useful stuff) then that might work. Though generally #! is so broken that it's really hard to come up with a reasonable option for these cases.

-- Ian Bicking | http://blog.ianbicking.org
Re: [Python-Dev] Executing zipfiles and directories (was Re: PyCon Keynote)
On Tue, Jan 26, 2010 at 2:44 PM, Glyph Lefkowitz gl...@twistedmatrix.com wrote: On Jan 26, 2010, at 3:20 PM, Ian Bicking wrote: Sadly you can't then do: chmod +x mz.py ./mz.py Unless I missed some subtlety earlier in the conversation, yes you can :). You are entirely correct; I accidentally was using Python 2.5 in my test.

-- Ian Bicking | http://blog.ianbicking.org
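The working recipe (Python 2.6+, per the correction above): write the shebang first, then append the archive. Zip readers locate the central directory from the end of the file, so the prefix is tolerated:

```python
import os
import stat
import subprocess
import sys
import tempfile
import zipfile

d = tempfile.mkdtemp()
path = os.path.join(d, 'mz.py')

# shebang first; ZipFile in 'a' mode appends a new archive to a non-zip file
with open(path, 'wb') as f:
    f.write(b'#!/usr/bin/env python3\n')
with zipfile.ZipFile(path, 'a') as z:
    z.writestr('__main__.py', "print('Hello from a zip file')\n")
os.chmod(path, os.stat(path).st_mode | stat.S_IEXEC)

out = subprocess.run([sys.executable, path],
                     capture_output=True, text=True).stdout
```

With the execute bit set, `./mz.py` works too, via the shebang.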
Re: [Python-Dev] Suggestion: new 3 release with backwards compatibility
On Tue, Jan 5, 2010 at 11:21 AM, Brian Curtin brian.cur...@gmail.com wrote: On Tue, Jan 5, 2010 at 10:10, Juan Fernando Herrera J. juan...@gmail.com wrote: How about a new python 3 release with (possibly partial) backwards compatibility with 2.6? I'm a big 3 fan, but I'm dismayed at the way major software hasn't been ported to it. I'm eager to use 3, but paradoxically, the 3 release makes me rather stuck with 2.6. Excuse me if this has been suggested in the past. The proper route to take, in my opinion, is to see what 2.x libraries you are using that are not 3.x compatible, run 2to3 on them, then run their test suite, and see where you get. Submit a patch or two to the library and see what happens -- it at least gets the wheels in motion.

It's not even that easy -- libraries can't apply patches for Python 3 compatibility as they usually break Python 2 compatibility. Potentially libraries could apply patches that make a codebase 2to3-ready, but from what I've seen that's more black magic than straightforward updating, as such patches have to trick 2to3 into producing the output that is desired. The only workable workflow I've seen people propose for maintaining a single codebase with compatibility across both 2 and 3 is to use such tricks, with aliases to suppress some 2to3 updates when they are inappropriate, so that you can run 2to3 on install and have a single canonical Python 2 source.

Python 2.7 won't help much (even though it is trying), as the introduction of non-ambiguous constructions like b'' isn't compatible with previous versions of Python and so can't be used in many libraries (support at least back to Python 2.5 is the norm for most libraries, I think). Also, running 2to3 on installation is kind of annoying, as you get source that isn't itself the canonical source, so to fix bugs you have to look at the installed source and trace it back to the bug in the original source.
I suspect a reasonable workflow might be possible with hg and maybe patch queues, but I don't feel familiar enough with those tools to map that out. -- Ian Bicking | http://blog.ianbicking.org | http://topplabs.org/civichacker ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Suggestion: new 3 release with backwards compatibility
On Tue, Jan 5, 2010 at 3:07 PM, Martin v. Löwis mar...@v.loewis.de wrote: It's not even that easy -- libraries can't apply patches for Python 3 compatibility as they usually break Python 2 compatibility. Potentially libraries could apply patches that make a codebase 2to3-ready, but from what I've seen that's more black magic than straightforward updating, as such patches have to trick 2to3 into producing the output that is desired. I wouldn't qualify it in that way. It may be necessary occasionally to trick 2to3, but that's really a bug in 2to3 which you should report, so that trickery is then a work-around for a bug - something that you may have to do with other API, as well. The black magic is really more in the parts that 2to3 doesn't touch at all (because they are inherently not syntactic); these are the problem areas Guido refers to. The black magic then is to make the same code work unmodified for both 2.x and 3.x. Just to clarify, the black magic I'm referring to is things like:

    try:
        unicode_ = unicode
    except NameError:
        unicode_ = str

and some other aliases like this that are unambiguous and which 2to3 won't touch (if you write them correctly). If the porting guide noted all these tricks (of which several have been developed, and I'm only vaguely aware of a few) that would be helpful. It's not a lot of tricks, but the tricks are not obvious and 2to3 gets the translation wrong pretty often without them. For instance, when I say str in Python 2 I often mean bytes, unsurprisingly, but 2to3 translates both str and unicode to str. That *nothing* translates to bytes by default (AFAIK) means that people must either be living in a bytes-free world (which sure, lots of code does) or they are using tricks not included in 2to3 itself. 
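A runnable sketch of this sort of alias module follows; the names `unicode_` and `bytes_` are illustrative, not from any particular porting guide.

```python
# Compatibility aliases of the kind described above: defined once at
# import time, then used everywhere, so 2to3 has nothing to rewrite.
import sys

if sys.version_info[0] >= 3:
    unicode_ = str
    bytes_ = bytes
else:
    unicode_ = unicode  # noqa: F821 -- this builtin only exists on Python 2
    bytes_ = str

# On either version, unicode_('abc') gives the native text type.
text = unicode_('abc')
print(type(text).__name__)
```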
Also replying to Glyph: Also, running 2to3 on installation is kind of annoying, as you get source that isn't itself the canonical source, so to fix bugs you have to look at the installed source and trace it back to the bug in the original source. Given the way tracebacks are built, i.e. from filenames stored in .pycs rather than based on where the code was actually loaded in the filesystem, couldn't 2to3 do .pyc rewriting to point at the original source? Sort of like our own version of the #line directive? :) Seriously though, I find it hard to believe that this is a big problem. The 3.x source looks pretty similar to the 2.x source, and it's good to look at both if you're dealing with a 3.x issue. Since 2to3 maintains line numbers, yes, it wouldn't be that bad. But then I don't currently develop any code that is installed, I only develop code that is directly from a source code checkout, and where the checkout is put on the path. I guess I could have something that automatically builds the code on every edit, and that's not infeasible. It's just not fun. So long as I have to support Python 2 (which is like forever), adding Python 3 only makes development that much more complicated and much less fun, with no concrete benefits. Which is terribly crotchety of me. Sorry. -- Ian Bicking | http://blog.ianbicking.org | http://topplabs.org/civichacker ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Pronouncement on PEP 389: argparse?
On Mon, Dec 14, 2009 at 12:04 PM, Steven Bethard steven.beth...@gmail.com wrote: So there wasn't really any more feedback on the last post of the argparse PEP other than a typo fix and another +1. I just converted a script over to argparse. It seems nice enough; I was doing a two-level command, and it was quite handy for that. One concern I had is that the naming seems at times trivially different than optparse, just because opt or option is replaced by arg or argument. So .add_option becomes .add_argument, and OptionParser becomes ArgumentParser. This seems unnecessary to me, and it makes converting the application harder than it had to be. It wasn't hard, but it could have been really easy. There are a couple other details like this that I think are worth resolving if argparse really is supposed to replace optparse. I'd change this language: "The optparse module is deprecated, and has been replaced by the argparse module." to: "The optparse module is deprecated and will not be developed further; development will continue with the argparse module." There are a lot of scripts using optparse, and if they are successfully using it there's no reason to stop using it. The proposed language seems to imply it is wrong to keep using optparse, which I don't think is the case. And people can pick up on this kind of language and get all excitable about it. -- Ian Bicking | http://blog.ianbicking.org | http://topplabs.org/civichacker ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
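The renaming complained about in the message above is nearly one-to-one; a minimal sketch (the `-n`/`--num` option is a made-up example):

```python
# argparse spelling of what optparse spells OptionParser/add_option.
import argparse

parser = argparse.ArgumentParser()                       # optparse: OptionParser()
parser.add_argument('-n', '--num', type=int, default=1)  # optparse: add_option(..., type='int')
ns = parser.parse_args(['-n', '3'])
print(ns.num)  # -> 3
```

Note the other differences the message mentions: argparse takes a real callable (`type=int`) where optparse took a string (`type='int'`), and `parse_args` returns one namespace rather than an `(options, args)` pair.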
Re: [Python-Dev] Pronouncement on PEP 389: argparse?
On Mon, Dec 14, 2009 at 12:43 PM, Steven Bethard steven.beth...@gmail.com wrote: On Mon, Dec 14, 2009 at 10:22 AM, Ian Bicking i...@colorstudy.com wrote: On Mon, Dec 14, 2009 at 12:04 PM, Steven Bethard steven.beth...@gmail.com wrote: So there wasn't really any more feedback on the last post of the argparse PEP other than a typo fix and another +1. I just converted a script over to argparse. It seems nice enough, I was doing a two-level command, and it was quite handy for that. One concern I had is that the naming seems at times trivially different than optparse, just because opt or option is replaced by arg or argument. So .add_option becomes .add_argument, and OptionParser becomes ArgumentParser. This seems unnecessary to me, and it make converting the application harder than it had to be. It wasn't hard, but it could have been really easy. There are a couple other details like this that I think are worth resolving if argparse really is supposed to replace optparse. Thanks for the feedback. Could you comment further on exactly what would be sufficient? It would be easy, for example, to add a subclass of ArgumentParser called OptionParser that has an add_option method. Do you also need the following things to work? Well, to argue against myself: having another class like OptionParser also feels like backward compatibility cruft. argparse is close enough to optparse (which is good) that I just wish it was a bit closer. * options, args = parser.parse_args() # options and args aren't separate in argparse This is a substantive enough difference that I don't really mind it, though if OptionParser really was a different class then maybe parse_args should act the same as optparse.OptionParser. What happens if you have positional arguments, but haven't declared any such arguments with .add_argument? Does it just result in an error? I suppose it must. * type='int', etc. # string type names aren't used in argparse This seems simple to support and unambiguous, so yeah. 
* action='store_false' default value is None # it's True in argparse I don't personally care about this; I agree the None default in optparse is sometimes peculiar (also for action='count' and action='append', where 0 and [] are the sensible defaults). Also I'd like %prog and %default supported, which should be fairly simple; heck, you could just do something like usage.replace('%prog', '%(prog)s') before substitution. Since %prog isn't otherwise valid (unless it was %%prog, which seems unlikely?) this seems easy. Ideally I really wish ArgumentParser was just named OptionParser, and that .add_argument was .add_option, and that argparse's current parse_args was named something different, so both the optparse parse_args (which returns (options, args)) and argparse's different parse_args return value could coexist. Also, generally, I'd like the common small bits of optparse (like type='int' and %prog) to just work, even if they aren't really extensible in the same manner as optparse. Another thing I just noticed is that argparse uses -v for version where optparse does not (it only adds --version); most of my scripts use -v to mean --verbose, causing problems. Since this is a poll question on the argparse site I assume this is an outstanding question for argparse, but just generally I think that doing things the same way as optparse should be preferred when at all reasonable. -- Ian Bicking | http://blog.ianbicking.org | http://topplabs.org/civichacker ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
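The `usage.replace()` idea from the message above can be sketched as follows; the program name `mytool` is just an example.

```python
# Translate an optparse-style usage string into argparse's %(prog)s
# form before building the parser, as the message suggests.
import argparse

optparse_usage = "%prog [options] FILE"
argparse_usage = optparse_usage.replace('%prog', '%(prog)s')
parser = argparse.ArgumentParser(usage=argparse_usage, prog='mytool')
print(parser.format_usage())  # -> usage: mytool [options] FILE
```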
Re: [Python-Dev] Pronouncement on PEP 389: argparse?
On Mon, Dec 14, 2009 at 6:34 PM, sstein...@gmail.com sstein...@gmail.com wrote: Although I am of the people who think working modules shouldn't be deprecated, I also don't think adding compatibility aliases is a good idea. They only make the APIs more bloated and maintenance more tedious. Let's keep the new APIs clean of any unnecessary baggage. Agreed. If you want to make an adapter to do things like convert 'int' to int, then call the new API then fine, but don't start crufting up a new API to make it 'easier' to convert. All crufting it up does is make it _less_ clear how to use the new API by bring along things that don't belong in it. The new API is almost exactly like the old optparse API. It's not like it's some shining jewel of perfection that would be tainted by somehow being similar to optparse when it's almost exactly like optparse already. If it wasn't like optparse, then fine, whatever; but it *is* like optparse, so these differences feel unnecessary. Converting 'int' to int internally in argparse is hardly difficult or unclear. If argparse doesn't do this, then I think at least it should give good error messages for all cases where these optparse-isms remain. For instance, now if you include %prog in your usage you get: ValueError: unsupported format character 'p' (0x70) at index 1 -- that's simply a bad error message. Giving a proper error message takes about as much code as making %prog work. I don't feel strongly that one is better than the other, but at least one of those should be done. -- Ian Bicking | http://blog.ianbicking.org | http://topplabs.org/civichacker ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
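The failure mode described above is easy to reproduce: a raw `%prog` left in an argparse usage string only fails later, at help-formatting time, with an opaque %-formatting error rather than a message naming `%prog`.

```python
# Demonstrating the bad error message: argparse performs %-substitution
# on the usage string and only understands %(prog)s, so %prog blows up.
import argparse

parser = argparse.ArgumentParser(usage="%prog [options]")
try:
    parser.format_usage()
except ValueError as exc:
    print(exc)  # e.g. unsupported format character 'p' (0x70) at index 1
```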
Re: [Python-Dev] Unittest/doctest formatting differences in 2.7a1?
On Wed, Dec 9, 2009 at 11:23 AM, Lennart Regebro lrege...@jarn.com wrote: Evolving the tests to avoid depending on these sorts of implementation details is reasonable, IMO, and could even be considered a bugfix by the Zope community. Evolving doctest.py so it can handle this by itself would be considered a bugfix by me. :) It's about time doctest got another run of development anyway. I can imagine a couple features that might help:

* Already in there, but sometimes hard to enable, is ellipsis. Can you already do this?

      >>> throw_an_exception()
      Traceback (most recent call last):
      ...
      DesiredException: ...

  I'd like to see doctests be able to enable the ELLIPSIS option internally and globally (currently it can only be enabled outside the doctest, or for a single line).

* Another option might be something version-specific, like:

      >>> throw_an_exception()  # +python<2.7
      ... old exception ...
      >>> throw_an_exception()  # +python>=2.7
      ... new exception ...

* Maybe slightly more general would be the ability to extend OutputCheckers more easily than currently. Maybe, for instance, # py_version(less=2.7) would enable the py_version output checker, which would always succeed if the version was greater than or equal to 2.7 (effectively ignoring the output). Or, maybe checkers could be extended so they could actually suppress the execution of code (avoiding throw_an_exception() from being called twice).

* Or, something more explicit than ELLIPSIS but also able to be more flexible than currently possible, like:

      >>> throw_an_exception()
      Traceback (most recent call last):
      ...
      DesiredException: [[2.6 error message | 2.7 error message]]

-- Ian Bicking | http://blog.ianbicking.org | http://topplabs.org/civichacker ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
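The ellipsis behavior discussed above can be checked directly; here is a runnable sketch that passes ELLIPSIS as a runner-wide option flag (playing the role of the "global" setting the message wishes for):

```python
# Run a small doctest whose expected exception message uses '...';
# with ELLIPSIS enabled at the runner level, the example passes.
import doctest

example = """
>>> raise ValueError('boom with details')
Traceback (most recent call last):
...
ValueError: boom ...
"""

parser = doctest.DocTestParser()
test = parser.get_doctest(example, {}, 'example', None, 0)
runner = doctest.DocTestRunner(optionflags=doctest.ELLIPSIS, verbose=False)
runner.run(test)
print(runner.failures)  # -> 0
```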
Re: [Python-Dev] Unittest/doctest formatting differences in 2.7a1?
On Wed, Dec 9, 2009 at 5:47 PM, Paul Moore p.f.mo...@gmail.com wrote: 2009/12/9 Lennart Regebro lrege...@jarn.com: On Wed, Dec 9, 2009 at 18:45, Ian Bicking i...@colorstudy.com wrote: It's about time doctest got another run of development anyway. I can imagine a couple features that might help: * Already in there, but sometimes hard to enable, is ellipsis. Can you already do this?

      >>> throw_an_exception()
      Traceback (most recent call last):
      ...
      DesiredException: ...

I think so, but what you need is:

      >>> throw_an_exception()
      Traceback (most recent call last):
      ...
      ...DesiredException: ...

No you don't. From the manual: When the IGNORE_EXCEPTION_DETAIL doctest option is specified, everything following the leftmost colon is ignored. So just use #doctest: +IGNORE_EXCEPTION_DETAIL Maybe that could be extended to also ignore everything up to a period (i.e., ignore the module name that seems to show up in 2.7 exception names, but not in previous versions). -- Ian Bicking | http://blog.ianbicking.org | http://topplabs.org/civichacker ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
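The manual's claim quoted above is easy to verify in a runnable sketch: with `+IGNORE_EXCEPTION_DETAIL` on the example line, a completely different expected message still passes as long as the exception type line matches.

```python
# A doctest whose expected exception message is wrong on purpose;
# IGNORE_EXCEPTION_DETAIL makes everything after the colon irrelevant.
import doctest

example = """
>>> raise ValueError('new message')  # doctest: +IGNORE_EXCEPTION_DETAIL
Traceback (most recent call last):
...
ValueError: completely different message
"""

parser = doctest.DocTestParser()
test = parser.get_doctest(example, {}, 'example', None, 0)
runner = doctest.DocTestRunner(verbose=False)
runner.run(test)
print(runner.failures)  # -> 0
```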
Re: [Python-Dev] PyPI front page
On Thu, Nov 12, 2009 at 7:52 PM, Antoine Pitrou solip...@pitrou.net wrote: Ben Finney ben+python at benfinney.id.au writes: There's a problem with the poll's placement: on the front page of the PyPI website. Speaking of which, why is it that http://pypi.python.org/pypi and http://pypi.python.org/pypi/ (note the ending slash) return different contents (the latter being very voluminous)? I always mistake one for the other when entering the URL directly. easy_install relied on the behavior of /pypi/ (it uses the long list to do case-insensitive searches). Someone changed it, easy_install broke, and a compromise was to keep /pypi/ the way it was (but not /pypi). Probably this could be removed, as the /simple/ index is already case-insensitive, so easy_install shouldn't have to hit /pypi/ at all. -- Ian Bicking | http://blog.ianbicking.org | http://topplabs.org/civichacker ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Distutils and Distribute roadmap (and some words on Virtualenv, Pip)
On Fri, Oct 9, 2009 at 3:54 AM, kiorky kio...@cryptelium.net wrote: If I had my way, buildout would use virtualenv and throw away its funny script generation. If virtualenv had existed before buildout Which one, the one provided to generate scripts from entry points with the *.egg recipes or the bin/buildout auto regeneration? Well, if multi-versioned installs were deprecated, it would not be necessary to use Setuptools' style of script generation. Instead you could simply dereference the entry point, calling the underlying function directly in the script. This detail is probably more of a distutils-sig question, and I don't have a strong opinion. But I was thinking specifically of the egg activation buildout puts at the top of scripts. began development, probably things would have gone this way. I think it would make the environment more pleasant for buildout users. Also * I don't think so, buildout is the only tool atm that permit to have really reproducible and isolated environments. Even, if you use the pip freezing machinery, it is not equivalent to buildout, Control! I believe that to fully insulate buildout you need still virtualenv --no-site-packages. But I'm not arguing that virtualenv/pip makes buildout obsolete, only that they have overlapping functionality, and I think buildout would benefit from making use of that overlap. * Buildout can have single part to construct required eggs, at a specific version and let you control that. Pip will just search for this version, see that it's not available and fail. You have even recipes (like minitage.recipe.egg that permit to construct eggs with special version when you apply patches onto, thus, you can have the same egg in different flavors in the same eggs cache available for different projects. Those projects will just have to pin the right version to use, Control!. In my own work I use multiple virtualenv environments for this use case, to similar effect. 
pip of course is not a generalized build tool, but then minitage.recipe.egg is not the main egg installer either. * Another thing is the funny script generation, you have not one global site-packages for your project, but one global cache. But from this global cache, your scripts will only have available the eggs you declared, see Control! * Moreover buildout is not only a python packages manager, it's some of its recipes that permit to use it as. Buildout is just a great deployment tool that allow to script and manage your project in a funny and flexible way, Control! Sure; I'm just advocating that buildout more explicitly use some of the functionality of virtualenv/pip (which may require some more features in those tools, but I'm open to that). But specific discussion of this would probably be more appropriate on distutils-sig. -- Ian Bicking | http://blog.ianbicking.org | http://topplabs.org/civichacker ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Distutils and Distribute roadmap (and some words on Virtualenv, Pip)
On Fri, Oct 9, 2009 at 7:32 AM, Paul Moore p.f.mo...@gmail.com wrote: 2009/10/9 Antoine Pitrou solip...@pitrou.net: Ian Bicking ianb at colorstudy.com writes: Someone mentioned that easy_install provided some things pip didn't; outside of multi-versioned installs (which I'm not very enthusiastic about) I'm not sure what this is? http://pip.openplans.org/#differences-from-easy-install If it's obsolete the website should be updated... Specifically, combine only installs from source with might not work on Windows and the result is pretty certainly unusable for C extensions on Windows. You can pretty much guarantee that the average user on Windows won't have a C compiler[1], and even if they do, they won't be able to carefully line up all the 3rd party C libraries needed to build some extensions. Binary packages are essential on Windows. I'll admit I have some blindness when it comes to Windows. I agree binary installation on Windows is important. (I don't think it's very important on other platforms, or at least not very effective in easy_install so it wouldn't be a regression.) I note some other differences in that document: It cannot install from eggs. It only installs from source. (Maybe this will be changed sometime, but it’s low priority.) Outside of binaries on Windows, I'm still unsure if installing eggs serves a useful purpose. I'm not sure if eggs are any better than wininst binaries either...? It doesn’t understand Setuptools extras (like package[test]). This should be added eventually. I haven't really seen Setuptools' extras used effectively, so I'm unsure if it's a useful feature. I understand the motivation for extras, but motivated features aren't necessarily useful features. It is incompatible with some packages that customize distutils or setuptools in their setup.py files. I don't have a solution for this, and generally easy_install does not perform much better than pip in these cases. Work in Distribute hopefully will apply to this issue. 
-- Ian Bicking | http://blog.ianbicking.org | http://topplabs.org/civichacker ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Distutils and Distribute roadmap (and some words on Virtualenv, Pip)
Probably all these discussions are better on distutils-sig (just copying python-dev to note the movement of the discussion) On Fri, Oct 9, 2009 at 11:49 AM, Michael Foord fuzzy...@voidspace.org.uk wrote: Outside of binaries on Windows, I'm still unsure if installing eggs serves a useful purpose. I'm not sure if eggs are any better than wininst binaries either...? Many Windows users would be quite happy if the standard mechanism for installing non-source distributions on Windows was via the wininst binaries. I wonder if it is going to be possible to make this compatible with the upcoming distutils package management 'stuff' (querying for installed packages, uninstallation etc) since installation/uninstallation goes through the Windows system package management feature. I guess it would be eminently possible but require some reasonably high level Windows-fu to do. As far as pip works, it unpacks a package and runs python setup.py install (and some options that aren't that interesting, but are provided specifically by setuptools). Well, it's slightly more complicated, but more to the point it doesn't install in-process or dictate how setup.py works, except that it takes some specific options. Running a Windows installer in the same way would be fine, in that sense. Alternately pip could unpack the wininst zip file and install it directly; I'm not sure if that would be better or worse? If wininst uses the central package manager of the OS then certain features (like virtualenv, PYTHONHOME, --prefix, etc) would not be possible. For Distribute (or Setuptools or by association pip) to see that a package is installed, it must have the appropriate metadata. For Setuptools (and Distribute 0.6) this is a directory or file, on sys.path, Package.egg-info (or in Package-X.Y.egg/EGG-INFO). If a file, it should be a PKG-INFO file, if a directory it should contain a PKG-INFO file. So however the package gets installed, if that metadata is installed then it can be queried. 
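The metadata-querying described above can be sketched with `pkg_resources` (which ships with setuptools); what gets printed depends entirely on which distributions happen to be installed.

```python
# Iterate over the installed distributions whose egg-info/PKG-INFO
# metadata is visible on sys.path, as the message describes.
import pkg_resources

for dist in pkg_resources.working_set:
    print(dist.project_name, dist.version)
```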
I don't think querying the Windows system package management would be necessary or desirable. Nobody is trying that with deb/rpm either. -- Ian Bicking | http://blog.ianbicking.org | http://topplabs.org/civichacker ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Distutils and Distribute roadmap (and some words on Virtualenv, Pip)
. Also virtualenv offers more system isolation. If I had my way, buildout would use virtualenv and throw away its funny script generation. If virtualenv had existed before buildout began development, probably things would have gone this way. I think it would make the environment more pleasant for buildout users. Also I wish it used pip instead of its own installation procedure (based on easy_install). I don't think the philosophical differences are that great, and that it's more a matter of history -- because the code is written, there's not much incentive for buildout to remove that code and rely on other libraries (virtualenv and pip). -- Ian Bicking | http://blog.ianbicking.org | http://topplabs.org/civichacker ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Web-SIG] Adding wsgiref to stdlib
Ian Bicking wrote: Phillip J. Eby wrote: At 02:32 PM 4/28/2006 -0500, Ian Bicking wrote: I'd like to include paste.lint with that as well (as wsgiref.lint or whatever). Since the last discussion I enumerated in the docstring all the checks it does. There's still some outstanding issues, mostly where I'm not sure if it is too restrictive (marked with @@ in the source). It's at: http://svn.pythonpaste.org/Paste/trunk/paste/lint.py Ian, I see this is under the MIT license. Do you also have a PSF contributor agreement (to license under AFL/ASF)? If not, can you place a copy of this under a compatible license so that I can add this to the version of wsgiref that gets checked into the stdlib? I don't have a contributor agreement. I can change the license in place, or sign an agreement, or whatever; someone should just tell me what to do. I faxed in a contributor agreement, and added this to the comment header of the file:

    # Also licensed under the Apache License, 2.0: http://opensource.org/licenses/apache2.0.php
    # Licensed to PSF under a Contributor Agreement

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Web-SIG] Adding wsgiref to stdlib
Phillip J. Eby wrote: At 02:32 PM 4/28/2006 -0500, Ian Bicking wrote: I'd like to include paste.lint with that as well (as wsgiref.lint or whatever). Since the last discussion I enumerated in the docstring all the checks it does. There's still some outstanding issues, mostly where I'm not sure if it is too restrictive (marked with @@ in the source). It's at: http://svn.pythonpaste.org/Paste/trunk/paste/lint.py Ian, I see this is under the MIT license. Do you also have a PSF contributor agreement (to license under AFL/ASF)? If not, can you place a copy of this under a compatible license so that I can add this to the version of wsgiref that gets checked into the stdlib? I don't have a contributor agreement. I can change the license in place, or sign an agreement, or whatever; someone should just tell me what to do. -- Ian Bicking | [EMAIL PROTECTED] | http://blog.ianbicking.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Web-SIG] Adding wsgiref to stdlib
Guido van Rossum wrote: PEP 333 specifies WSGI, the Python Web Server Gateway Interface v1.0; it's written by Phillip Eby who put a lot of effort in it to make it acceptable to very diverse web frameworks. The PEP has been well received by web framework makers and users. As a supplement to the PEP, Phillip has written a reference implementation, wsgiref. I don't know how many people have used wsgiref; I'm using it myself for an intranet webserver and am very happy with it. (I'm asking Phillip to post the URL for the current source; searching for it produces multiple repositories.) I believe that it would be a good idea to add wsgiref to the stdlib, after some minor cleanups such as removing the extra blank lines that Phillip puts in his code. Having standard library support will remove the last reason web framework developers might have to resist adopting WSGI, and the resulting standardization will help web framework users. I'd like to include paste.lint with that as well (as wsgiref.lint or whatever). Since the last discussion I enumerated in the docstring all the checks it does. There's still some outstanding issues, mostly where I'm not sure if it is too restrictive (marked with @@ in the source). It's at: http://svn.pythonpaste.org/Paste/trunk/paste/lint.py I think another useful addition would be some prefix-based dispatcher, similar to paste.urlmap (but probably a bit simpler): http://svn.pythonpaste.org/Paste/trunk/paste/urlmap.py The motivation there is to give people the basic tools to simple multi-application hosting, and in the process implicitly suggest how other dispatching can be done. I think this is something that doesn't occur to people naturally, and they see it as a flaw in the server (that the server doesn't have a dispatching feature), and the result is either frustration, griping, or bad kludges. 
By including a basic implementation of WSGI-based dispatching the standard library can lead people in the right direction for more sophisticated dispatching. And prefix dispatching is also quite useful on its own, it's not just educational. Last time this was brought up there were feature requests and discussion on how industrial strength the webserver in wsgiref ought to be but nothing like the flamefest that setuptools caused (no comments please). No one disagreed with the basic premise though, just some questions about the particulars of the server. I think there were at least a couple small suggestions for the wsgiref server; in particular maybe a slight refactoring to make it easier to use with https. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Web-SIG] Adding wsgiref to stdlib
Guido van Rossum wrote: I think another useful addition would be some prefix-based dispatcher, similar to paste.urlmap (but probably a bit simpler): http://svn.pythonpaste.org/Paste/trunk/paste/urlmap.py IMO this is getting into framework design. Perhaps something like this could be added in 2.6? I don't think it's frameworky. It could be used to build a very primitive framework, but even then it's not a particularly useful starting point. In Paste this would generally be used below any framework (or above I guess, depending on which side is up). You'd pass /blog to a blog app, /cms to a cms app, etc. WSGI already is very specific about what needs to be done when doing this dispatching (adjusting SCRIPT_NAME and PATH_INFO), and that's all that the dispatching needs to do. The applications themselves are written in some framework with internal notions of URL dispatching, but this doesn't infringe upon those. (Unless the framework doesn't respect SCRIPT_NAME and PATH_INFO; but that's their problem, as the dispatcher is just using what's already allowed for in the WSGI spec.) It also doesn't overlap with frameworks, as prefix-based dispatching isn't really that useful in a framework. 
The basic implementation is:

    class PrefixDispatch(object):
        def __init__(self):
            self.applications = {}

        def add_application(self, prefix, app):
            self.applications[prefix] = app

        def __call__(self, environ, start_response):
            apps = sorted(self.applications.items(), key=lambda x: -len(x[0]))
            path_info = environ.get('PATH_INFO', '')
            for prefix, app in apps:
                if not path_info.startswith(prefix):
                    continue
                environ['SCRIPT_NAME'] = environ.get('SCRIPT_NAME', '') + prefix
                environ['PATH_INFO'] = environ.get('PATH_INFO', '')[len(prefix):]
                return app(environ, start_response)
            start_response('404 Not Found', [('Content-type', 'text/html')])
            return ['<html><body><h1>Not Found</h1></body></html>']

There's a bunch of checks that should take place (most related to /'s), and the not-found response should be configurable (probably as an application that can be passed in as an argument). But that's most of what it should do. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
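A dispatcher along these lines can be exercised without a real server by calling it with a synthetic WSGI environ. The class is restated here so the sketch is self-contained, and the blog app and the '/blog' prefix are made-up examples.

```python
# Self-contained sketch of a prefix dispatcher plus a fake request.
class PrefixDispatch(object):
    def __init__(self):
        self.applications = {}

    def add_application(self, prefix, app):
        self.applications[prefix] = app

    def __call__(self, environ, start_response):
        # try longer prefixes first so e.g. /blog/admin beats /blog
        apps = sorted(self.applications.items(), key=lambda item: -len(item[0]))
        path_info = environ.get('PATH_INFO', '')
        for prefix, app in apps:
            if not path_info.startswith(prefix):
                continue
            environ['SCRIPT_NAME'] = environ.get('SCRIPT_NAME', '') + prefix
            environ['PATH_INFO'] = path_info[len(prefix):]
            return app(environ, start_response)
        start_response('404 Not Found', [('Content-type', 'text/html')])
        return [b'<html><body><h1>Not Found</h1></body></html>']


def blog_app(environ, start_response):
    # toy application that echoes the PATH_INFO it was handed
    start_response('200 OK', [('Content-type', 'text/plain')])
    return [('blog got ' + environ['PATH_INFO']).encode('ascii')]


dispatch = PrefixDispatch()
dispatch.add_application('/blog', blog_app)

statuses = []
body = dispatch({'SCRIPT_NAME': '', 'PATH_INFO': '/blog/entry/1'},
                lambda status, headers: statuses.append(status))
print(statuses[0])  # -> 200 OK
print(body[0])      # -> b'blog got /entry/1'
```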
Re: [Python-Dev] [Web-SIG] Adding wsgiref to stdlib
Phillip J. Eby wrote: I'd like to include paste.lint with that as well (as wsgiref.lint or whatever). Since the last discussion I enumerated in the docstring all the checks it does. There's still some outstanding issues, mostly where I'm not sure if it is too restrictive (marked with @@ in the source). It's at: http://svn.pythonpaste.org/Paste/trunk/paste/lint.py +1, but lose the unused 'global_conf' parameter and 'make_middleware' functions. Yeah, those are just related to Paste Deploy and wouldn't go in. I think another useful addition would be some prefix-based dispatcher, similar to paste.urlmap (but probably a bit simpler): http://svn.pythonpaste.org/Paste/trunk/paste/urlmap.py I'd rather see something a *lot* simpler - something that just takes a dictionary mapping names to application objects, and parses path segments using wsgiref functions. That way, its usefulness as an example wouldn't be obscured by having too many features. Such a thing would still be quite useful, and would illustrate how to do more sophisticated dispatching. Something more or less like:

    from wsgiref.util import shift_path_info

    # usage:
    #   main_app = AppMap(foo=part_one, bar=part_two, ...)

    class AppMap:
        def __init__(self, **apps):
            self.apps = apps

        def __call__(self, environ, start_response):
            name = shift_path_info(environ)
            if name is None:
                return self.default(environ, start_response)
            elif name in self.apps:
                return self.apps[name](environ, start_response)
            return self.not_found(environ, start_response)

        def default(self, environ, start_response):
            return self.not_found(environ, start_response)

        def not_found(self, environ, start_response):
            # code to generate a 404 response here

This should be short enough to highlight the concept, while still providing a few hooks for subclassing. That's mostly what I was thinking, though using a full prefix (instead of just a single path segment), and the default is the application at '', like in my other email. 
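For reference, the shift_path_info function the sketch relies on did land with wsgiref in Python 2.5; it consumes one path segment, moving it from PATH_INFO onto SCRIPT_NAME in place, and returns it:

```python
from wsgiref.util import shift_path_info

environ = {'SCRIPT_NAME': '', 'PATH_INFO': '/foo/bar'}
segment = shift_path_info(environ)  # consumes one segment, mutating environ
# segment is 'foo'; SCRIPT_NAME gains '/foo' and PATH_INFO shrinks to '/bar'
```

This is the single-segment bookkeeping AppMap builds on, where a prefix dispatcher would instead move a whole literal prefix across in one step.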
paste.urlmap has several features I wouldn't propose (like domain and port matching, more Paste Deploy stuff, and a proxy object that I should probably just delete); I probably should have been more specific. URLMap's dictionary interface isn't that useful either. Another feature that the example in my other email doesn't have is / handling, specifically redirecting /something-that-matches to /something-that-matches/ (something Apache's Alias doesn't do but should). Host and port matching is pretty easy to do at the same time, and in my experience can be useful to do at the same time, but I don't really care if that feature goes in. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] [Web-SIG] Adding wsgiref to stdlib
Phillip J. Eby wrote: At 01:19 PM 4/28/2006 -0700, Guido van Rossum wrote: It still looks like an application of WSGI, not part of a reference implementation. Multiple apps looks like an advanced topic to me; more something that the infrastructure (Apache server or whatever) ought to take care of. I'm fine with a super-simple implementation that emphasizes the concept, not feature-richness. A simple dict-based implementation showcases both the wsgiref function for path shifting, and the idea of composing an application out of mini-applications. (The point is to demonstrate how people can compose WSGI applications *without* needing a framework.) But I don't think that this demo should be a prefix mapper; people doing more sophisticated routing can use Paste or Routes. I don't see why not to use prefix matching. It is more consistent with the handling of the default application ('', instead of a method that needs to be overridden), and more general, and the algorithm is only barely more complex and not what I'd call sophisticated. The default application handling in particular means that AppMap isn't really useful without subclassing or assigning to .default. Prefix matching wouldn't show off anything else in wsgiref, because there's nothing else to use; paste.urlmap doesn't use any other part of Paste either (except one unimportant exception) because there's just no need. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] [Web-SIG] Adding wsgiref to stdlib
Phillip J. Eby wrote: At 04:04 PM 4/28/2006 -0500, Ian Bicking wrote: I don't see why not to use prefix matching. It is more consistent with the handling of the default application ('', instead of a method that needs to be overridden), and more general, and the algorithm is only barely more complex and not what I'd call sophisticated. The default application handling in particular means that AppMap isn't really useful without subclassing or assigning to .default. Prefix matching wouldn't show off anything else in wsgiref, Right, that would be taking away one of the main reasons to include it. That's putting the cart in front of the horse, using a matching algorithm because that's what shift_path_info does, not because it's the most natural or useful way to do the match. I suggest prefix matching not because it shows how the current functions in wsgiref work, but because it shows a pattern of dispatching WSGI applications on a level that is typically (but for WSGI, unnecessarily) built into the server. The educational value is in the pattern, not in the implementation. If you want to show how the functions in wsgiref work, then that belongs in documentation. Which would be good too, people like examples, and the more examples in the wsgiref docs the better. People are much less likely to see examples in the code itself. To make the real dispatcher, I'd flesh out what I wrote a little bit, to handle the default method in a more meaningful way, including the redirect. All that should only add a few lines, however. It will still be only a couple lines less than prefix matching. Another issue with your implementation is the use of keyword arguments for the path mappings, even though path mappings have no association with keyword arguments or valid Python identifiers. 
-- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] [Web-SIG] Adding wsgiref to stdlib
Phillip J. Eby wrote: At 05:47 PM 4/28/2006 -0500, Ian Bicking wrote: It will still be only a couple lines less than prefix matching. That's beside the point. Prefix matching is inherently a more complex concept, and more likely to be confusing, without introducing much in the way of new features. I just don't understand this. It's not more complex. Prefix matching works like:

- get the prefixes
- order them longest first
- check each one against PATH_INFO
- use the matched app or call the not found handler

Name matching works like:

- get the mapping
- get the next chunk
- get the app associated with that chunk
- use that app or call the not found handler

One is not more complex than the other. If I want to dispatch /foo/bar, why not just use: AppMap(foo=AppMap(bar=whatever)) You create an intermediate application with no particular purpose. You get two default handlers, two not found handlers, and you create an object tree that is distracting because it is artificial. Paths are strings, not trees or objects. When you confuse strings for objects you are moving into framework territory. If I was going to include a more sophisticated dispatcher, I'd add an ordered regular expression dispatcher, since that would support use cases that the simple or prefix dispatchers would not, but it would also support the prefix cases without nesting. That is significantly more complex, because SCRIPT_NAME/PATH_INFO cannot be used to express what the regular expression matched. It also overlaps with frameworks. WSGI doesn't offer any standard mechanism to do that sort of thing. It could (e.g., a wsgi.path_vars key), but it doesn't. Or you do something that looks like mod_rewrite, but no one wants that. Prefix based routing represents a real cusp -- more than that, and you have to invent conventions not already present in the WSGI spec, and you overlap with frameworks. Less than that... well, you can't do a whole lot less than that. 
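The cusp described here comes from a WSGI invariant: SCRIPT_NAME + PATH_INFO must always reconstruct the request path, which only works when the consumed portion is a literal prefix. A minimal illustration (the consume_prefix helper is made up for this sketch):

```python
def consume_prefix(environ, prefix):
    # Standard WSGI dispatch bookkeeping: move a literal prefix
    # from PATH_INFO onto SCRIPT_NAME.
    assert environ['PATH_INFO'].startswith(prefix)
    environ['SCRIPT_NAME'] = environ.get('SCRIPT_NAME', '') + prefix
    environ['PATH_INFO'] = environ['PATH_INFO'][len(prefix):]

environ = {'SCRIPT_NAME': '', 'PATH_INFO': '/blog/entry/42'}
consume_prefix(environ, '/blog')

# The invariant holds: the two pieces still concatenate to the
# original request path.
assert environ['SCRIPT_NAME'] + environ['PATH_INFO'] == '/blog/entry/42'
```

A regex match like r'/(\w+)/entry/(\d+)' has no such decomposition into consumed-prefix and remainder, which is why it would need a new convention (the hypothetical wsgi.path_vars key mentioned above) rather than fitting into what the spec already provides.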
-- Ian Bicking | [EMAIL PROTECTED] | http://blog.ianbicking.org
Re: [Python-Dev] Dropping __init__.py requirement for subpackages
Joe Smith wrote: It seems to me that the right way to fix this is to simply make a small change to the error message. On a failed import, have the code check if there is a directory that would have been the requested package if it had contained an __init__ module. If there is, then append a message like "You might be missing an __init__.py file." +1. It's not that putting an __init__.py file in is hard, it's that people have a hard time realizing when they've forgotten to do it. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] setuptools in 2.5.
Paul Moore wrote: And no, I don't want to install the 2 versions side-by-side. Ian Bicking complained recently about the uncertainty of multiple directories on sys.path meaning you can't be sure which version of a module you get. Well, having 2 versions of a module installed and knowing that which one is in use depends on require calls which get issued at runtime worries me far more. These are valid concerns. From my own experience, I don't think setuptools makes it any worse than the status quo, but it certainly doesn't magically solve these issues. And though these issues are intrinsically hard, I think Python makes it harder than it should. For instance, if you really want to be confident about how your libraries are laid out, this script is the most reliable way: http://peak.telecommunity.com/dist/virtual-python.py It basically copies all of Python to a new directory. That this is required to get a self-consistent and well-encapsulated Python setup is... well, not good. Maybe this could be fixed for Python 2.5 as well -- to at least make this isolation easier to apply. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] setuptools in 2.5. (summary)
M.-A. Lemburg wrote: Anthony Baxter wrote: In an attempt to help this thread reach some sort of resolution, here's a collection of arguments against and in favour of setuptools in 2.5. My conclusions are at the end. Thanks for the summary. I'd like to add some important aspects (for me at least) that are missing: - setuptools should not change the standard distutils install command to install everything as eggs Eggs are just one distribution format out of many. They do serve their purpose, just like RPMs, DEBs or Windows installers do. I think Eggs can be a bit confusing. They really serve two purposes, but using the same format. They are a distribution mechanism, which is probably one of the less important aspects, and there's the installation format. So you don't have to use them as a distribution format to still use them as an installation format. As an installation format they overlap with OS-level metadata, but that OS-level metadata has always been completely unavailable to Python programs, so a little duplication has to be put up with. And anyway, the packaging systems can manage the system integrity well enough to keep that information in sync. Even though eggs overlap, they don't have to compete. However, when running python setup.py install you are in fact installing from source, so there's no need to wrap things up again. The distutils default of actually installing things in the standard Python is good, has worked for years and should continue to do so. The extra information needed by the dependency checking can easily be added to the package directory of the installed package or stored elsewhere in a repository of installed packages or as a separate egg-info directory if we want to stick with setuptools' way of using the path name for getting meta-information on a package. Phillip can clarify this more, but I believe he's planning on Python 2.5 setuptools to install similar to distutils, but with a sibling .egg-info directory. 
There's already an option to do this, it's just a matter of whether it will be the default. A package with a sibling .egg-info directory is a real egg, but that it's a real egg probably highlights that eggness can be a bit confusing. Placing the egg-files into the system as ZIP files should be an option, e.g. as a separate install_egg command, not the default. I would prefer this too; even though Phillip has fixed the traceback problems for 2.5 I personally just prefer files I can view in other tools as well (my text editor doesn't like zip files, for instance). I typically make this change in distutils.cfg for my own systems. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] setuptools in 2.5.
And now for a little pushback the other way -- as of this January TurboGears has served up 100,000 egg files (I'm not sure what the window for all those downloads is, but it hasn't been very long). Has it occurred to you that they know something you don't about distribution? ElementTree would be among those egg files, so you should also consider how many people *haven't* asked you about problems related to the installation process. Really, I just shouldn't have made this argument; the discussion was going back towards a calmer and more constructive discussion and I pushed it the other way. Sorry. Please ignore. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] setuptools in 2.5.
Paul Moore wrote: 2. Distributors will supply .egg files rather than bdist_wininst installers (this is already happening). Really people should at least be uploading source packages in addition to eggs; it's certainly not hard to do so. Perhaps a distributor quick intro needs to be written for the standard library. Something short; both distutils and setuptools documentation are pretty long, and for someone who just has some simple Python code to get out it's overwhelming. Fredrik also asked for a document, but I don't think it is this document; it wasn't clear to me what exactly he wanted documented. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] PEP 359: The make Statement
Steven Bethard wrote: This PEP proposes a generalization of the class-declaration syntax, the ``make`` statement. The proposed syntax and semantics parallel the syntax for class definition, and so::

    make callable name tuple:
        block

I can't really see any use case for tuple. In particular, you could always choose to implement this:

    make Foo someobj(stuff):
        ...

like:

    make Foo(stuff) someobj:
        ...

I don't think I'd naturally use the tuple position for anything, and so it's an arbitrary and usually empty position in the call, just to support type() which already has its own syntax. So maybe it makes less sense to copy the class/metaclass arguments so closely, and so moving to this might feel a bit better:

    make someobj Foo(stuff):
        ...

And actually it reminds me more of class statements, which are in the form keyword name(things_you_build_from). Which then obviously leads to more parentheses:

    make someobj(Foo(stuff)):
        ...

Except I don't know what make someobj(A, B) would mean, so maybe the parentheses are uncalled for. I prefer the look of the statement without parentheses anyway. Really, to me this syntax feels like support for a more prototype-based construct. And many of the class-abusing metaclasses I've used have really looked similar to prototypes. The class statement is caught up in a bunch of very class-like semantics, and a more explicit/manual technique of creating objects opens up lots of potential. With that in mind, I think __call__ might be the wrong method to call on the builder. For instance, if you were actually going to implement prototypes on this, you wouldn't want to steal all uses of __call__ just for the cloning machinery. So __make__ would be nicer. Personally this would also let people using older constructs (like a plain __call__(**kw)) keep that in addition to supporting this new construct. 
-- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] PEP 359: The make Statement
BJörn Lindqvist wrote: [nice way to declare properties with make] Of course, properties are only one of the many possible uses of the make statement. The make statement is useful in essentially any situation where a name is associated with a namespace. So, for So far, in this thread that is the only useful use of the make statement that has been presented. I'd like to see more examples. In SQLObject I would prefer:

    class Foo(SQLObject):
        make IntCol bar:
            notNull = True

In FormEncode I would prefer:

    make Schema registration:
        make String name:
            max_length = 100
            not_empty = True
        make PostalCode postal_code:
            not_empty = True
        make Int age:
            min = 18

In another thread on the python-3000 list I suggested (using :

    class Point(object):
        make setonce x:
            "x coordinate"
        make setonce y:
            "y coordinate"

For a read-only x and y property (setonce because they have to be set to *something*, but just never re-set). Interfaces are nice:

    make interface IValidator:
        make attribute if_empty:
            """If this attribute is not NoDefault, then this value
            will be used in lieu of an empty value"""
            default = NoDefault
        def to_python(value, state):
            ...

Another descriptor, stricttype (http://svn.colorstudy.com/home/ianb/recipes/stricttype.py):

    class Pixel(object):
        make stricttype x:
            type = int
        make stricttype y:
            type = int

(Both this descriptor and setonce need to know their name if they are going to store their value in the object in a stable location) It would be really cool if you could go through the standard library, and replace code there with code using the make statement. I think a patch showing how much nicer good Python code would be with the make statement would be a very convincing argument. I don't know if the standard library will have a whole lot; make is really only useful when frameworks are written to use it, and there's just not a lot of framework in the standard library. Maybe:

    make OptionParser myparser:
        make Option verbose:
            short = '-v'
            help = ...
-- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] PEP 359: The make Statement
Steven Bethard wrote: On 4/13/06, Martin v. Löwis [EMAIL PROTECTED] wrote: Steven Bethard wrote: I know 2.5's not out yet, but since I now have a PEP number, I'm going to go ahead and post this for discussion. Currently, the target version is Python 2.6. You can also see the PEP at: http://www.python.org/dev/peps/pep-0359/ Thanks in advance for the feedback! [snip] Would it be possible/useful to have a pre-block hook to the callable, which would provide the dictionary; this dictionary might not be a proper dictionary (but only some mapping), or it might be pre-initialized. Yeah, something along these lines came up in discussing using the make statement for XML generation. You might want to write something like:

    make Element html:
        make Element head:
            ...
        make Element body:
            ...

however, this doesn't work with the current semantics because: (1) dict's are unordered (2) dict's can't have the same name (key) twice Is the body of the make statement going to work like the body of a class statement? I would assume so, in which case (2) would be a given. That is, if you can do:

    make Element html:
        title_text = 'foo'
        make Element title:
            content = title_text
        del title_text

Then you really can't have multiple keys with the same name unless you give up the ability to refer in the body of the make statement to things defined earlier in that same body. Unless items that were rebound were hidden, but still somehow accessible to Element. and so you can only generate XML/HTML where the order of elements doesn't matter and you never have repeated elements. That's not really XML/HTML anymore. You could probably solve this if you could supply a different type of dict-like object for the block to be executed in. Then we'd have to have a translation from something like::

    make callable name tuple in mapping:
        block

to something like::

    name = callable(name, tuple, namespace)

where namespace is created by executing the statements of block in the mapping object. 
Skipping the syntax discussion for the moment, I guess I have a few problems with this: (1) It complicates the statement semantics pretty substantially (2) It breaks the parallel with the class statement since you can't supply an alternate mapping type for class bodies to be executed in (3) It adds some degree of coupling between the mapping type and the callable. For the example above, I expect I'd have to do something like::

    make Element html in ElementDict():
        make Element head in ElementDict():
            ...
        make Element body in ElementDict():
            ...

Maybe Element.__make_dict__ could be ElementDict. This doesn't feel that unclean if you are also using Element.__make__ instead of Element.__call__; though there is a hidden cleverness factor (maybe in a bad way). -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
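The translation the PEP proposes can be emulated in today's Python with exec, which makes the semantics concrete. The make_emulation helper and the block-as-string argument here are illustrative inventions, not part of the PEP:

```python
def make_emulation(callable_, name, args, block_source, mapping=None):
    # Emulates PEP 359's translation:
    #     name = callable(name, tuple, namespace)
    # by executing the block's statements in a (possibly custom) namespace.
    namespace = {} if mapping is None else mapping
    exec(block_source, {}, namespace)
    return callable_(name, args, dict(namespace))

# Using type() as the callable reproduces an ordinary class statement:
Point = make_emulation(type, 'Point', (object,), "x = 0\ny = 0\n")
```

Passing an ordered or duplicate-tolerant mapping as the mapping argument is exactly the hook being debated above: it works mechanically, but it couples the mapping type to the callable.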
Re: [Python-Dev] tally (and other accumulators)
Alex Martelli wrote: It's a bit late for 2.5, of course, but, I thought I'd propose it anyway -- I noticed it on c.l.py. In 2.3/2.4 we have many ways to generate and process iterators but few accumulators -- functions that accept an iterable and produce some kind of summary result from it. sum, min, max, for example. And any, all in 2.5. The proposed function tally accepts an iterable whose items are hashable and returns a dict mapping each item to its count (number of times it appears). This is quite general and simple at the same time: for example, it was proposed originally to answer some complaint about any and all giving no indication of the count of true/false items: tally(bool(x) for x in seq) would give a dict with two entries, counts of true and false items. Just like the other accumulators mentioned above, tally is simple to implement, especially with the new collections.defaultdict:

    import collections

    def tally(seq):
        d = collections.defaultdict(int)
        for item in seq:
            d[item] += 1
        return dict(d)

Or:

    import collections

    bag = collections.Bag([1, 2, 3, 2, 1])
    assert bag.count(1) == 2
    assert bag.count(0) == 0
    assert 3 in bag
    # etc...

-- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
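A quick check of tally on the motivating any/all case from the message (for what it's worth, this is essentially the API that later shipped as collections.Counter in Python 2.7):

```python
from collections import defaultdict

def tally(seq):
    # The implementation from the message above.
    d = defaultdict(int)
    for item in seq:
        d[item] += 1
    return dict(d)

seq = [0, 1, 2, 0, 0, 5]
result = tally(bool(x) for x in seq)
# Two entries: counts of true and false items.
```

So instead of any/all's single boolean, you get the full breakdown of how many items were true and how many were false.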
Re: [Python-Dev] Class decorators
Fred L. Drake, Jr. wrote: It's too bad this syntax is ambiguous:

    class Foo:
        """Docstring here, blah blah blah"""
        @implements(IFoo)

As this achieves a desirable highlighting of the specialness, without forcing the decorator outside the class. Oh well. Agreed, but... guess we can't have everything. On the other hand, something like:

    class Foo:
        """Documentation is good."""
        @class implements(IFoo)

is not ambiguous. Hmm. It even says what it means. :-) This is quite reminiscent of Ruby to me, where:

    class Foo:
        implements(IFoo)

basically means:

    class Foo:
        pass
    Foo.implements(IFoo)

For a variety of reasons that doesn't work for Python, but what you propose accomplishes the same basic thing. I'm coming in a little late on all this, but I find moving the decorator inside the class statement to be a substantial improvement, even if it is also a trivial improvement ;) Anytime I've done thought experiments about using class decorators, the results are very hard to read. That classes are inherently declarative and open, while functions are imperative and closed, makes the constructs very different. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] decorator module patch
Georg Brandl wrote: Also, I thought we were trying to move away from modules that shared a name with one of their public functions or classes. As it is, I'm not even sure that a name like decorator gives the right emphasis. I thought about decorators too, that would make decorators.decorator. Hm. I personally like pluralized modules for exactly the reason that they don't clash as much with members or likely local variables. datetime.datetime frequently leads me to make mistakes. In general, decorators belong in the appropriate domain-specific module (similar to context managers). In this case, though, the domain is the manipulation of Python functions - maybe the module should be called metafunctions or functools to reflect its application domain, rather than the coincidental fact that its first member happens to be a decorator. Depends on what else will end up there. If it's memoize or deprecated then the name functools doesn't sound too good either. memoize seems to fit into functools fairly well, though deprecated not so much. functools is similarly named to itertools, another module that is kind of vague in scope (though functools is much more vague). partial would make just as much sense in functools as in functional. -- Ian Bicking | [EMAIL PROTECTED] | http://blog.ianbicking.org
[Python-Dev] multidict API
I'm not really making any actionable proposal here, so maybe this is off-topic; if so, sorry. Back during the defaultdict discussion I proposed a multidict object (http://mail.python.org/pipermail/python-dev/2006-February/061264.html) -- right now I need to implement one to represent web form submissions. It would also be ordered in that case. The question then is what the API should look like for such an object -- an ordered, multi-value dictionary. I would really like it if this object were in the collections module, but I'm too lazy to try to pursue that now. But if it did show up, I'd like the class I write to look the same. There are some open questions I see:

* Does __getitem__ return a list of the values for all matching keys (never a KeyError, though possibly returning []), or does it return the first matching value?
* Either way, I assume there will be another method, like getfirst or getall, that will present the other choice. What would it be named? Should it have a default?
* Should there be a method to get a single value, that implicitly asserts that there is only one matching key?
* Should the default for .get() be None, or something else?
* Does __setitem__ overwrite any or all values with matching keys?
* If so, there should be another method like .add(key, value) which does not overwrite. Or, if __setitem__ does not overwrite, then there should be a method that does.
* Does __delitem__ raise a KeyError if the key is not found?
* Does .keys() return all unique keys, or all keys in order (meaning a key may show up more than once in the list)?

I really could go either way on all of these questions, though I think there are constraints -- answer one of the questions and another becomes obvious. But you can answer them in whatever order you want. 
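One possible set of answers, sketched as a minimal ordered multidict. Every choice here -- the getall name, __getitem__ returning the first value, __setitem__ overwriting all values, add() never overwriting -- is just one option from the list above, not a settled API:

```python
class MultiDict(object):
    # A minimal ordered, multi-value dictionary sketch.
    def __init__(self):
        self._items = []  # list of (key, value), preserving insertion order

    def add(self, key, value):
        # Never overwrites; appends another value for the key.
        self._items.append((key, value))

    def __setitem__(self, key, value):
        # One possible choice: overwrite *all* matching values.
        self._items = [(k, v) for k, v in self._items if k != key]
        self._items.append((key, value))

    def __getitem__(self, key):
        # One possible choice: return the first matching value.
        for k, v in self._items:
            if k == key:
                return v
        raise KeyError(key)

    def getall(self, key):
        # Never raises; may return [].
        return [v for k, v in self._items if k == key]

    def keys(self):
        # All keys in order; a key may appear more than once.
        return [k for k, v in self._items]

# A web form submission with a repeated field:
form = MultiDict()
form.add('name', 'Ian')
form.add('tag', 'python')
form.add('tag', 'web')
```

Answering __getitem__ the other way (always a list, never a KeyError) would in turn make getall redundant and suggest a getfirst instead, which is the "answer one question and another becomes obvious" effect.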
-- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] multidict API
Raymond Hettinger wrote: [Ian Bicking] The question then is what the API should look like for such an object -- an ordered, multi-value dictionary. May I suggest that multidict begin its life as a cookbook recipe so that its API can mature. There's already quite a few recipes out there. But I should probably collect them as well:

http://www.voidspace.org.uk/python/odict.html
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/107747
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/438823
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/173072
http://urchin.earth.li/~twic/odict.py
http://www.astro.washington.edu/owen/ROPython.html
http://home.arcor.de/wolfgang.grafen/Python/Modules/Modules.html
email.Message.Message
http://cvs.eby-sarna.com/wsgiref/src/wsgiref/headers.py?view=markup

Well, there's a few, mostly ordered, some multivalue. A comparison would be helpful, but maybe a little later. odict is probably the most filled-out, though it is probably more listish than I really would like. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] quit() on the prompt
Neil Schemenauer wrote: Bad idea, as several pointed out -- quit() should return a 0 exit to the shell. I like the idea of making quit callable. One small concern I have is that people will use it in scripts to exit (rather than one of the other existing ways to exit). OTOH, maybe that's a feature. I actually thought it was only defined for interactive sessions, but a brief test shows I was wrong. It doesn't bother me, but it does make me think that exit(1) should exit with a code of one. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
[Python-Dev] quit() on the prompt
Frederick suggested a change to quit/exit a while ago, so it wasn't just a string with slight instructional purpose, but actually useful. The discussion was surprisingly involved, despite the change really truly not being that big. And everyone drifted off, too tired from the discussion to make a change. I suppose it didn't help that the original proposal struck some people as too magic, while there were some more substantive problems brought up as well, and when you mix aesthetic with technical concerns everyone gets all distracted and worked up. Anyway, I would like to re-propose one of the ideas that came up (originally from Ping?):

    class Quitter(object):
        def __init__(self, name):
            self.name = name
        def __repr__(self):
            return 'Use %s() to exit' % self.name
        def __call__(self):
            raise SystemExit()

    quit = Quitter('quit')
    exit = Quitter('exit')

This is not very magical, but I think is more helpful than the current behavior. It does not satisfy the "just do what I said" argument for not requiring the call (quit() not quit), but eh -- I guess it seemed like everything that didn't require a call had some scary corner case where the interpreter would abruptly exit. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
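A variant that honors an exit code, as suggested elsewhere in the thread where exit(1) was expected to exit with status 1. The optional code argument is an addition to the sketch above (the Quitter that eventually shipped in site.py takes a similar argument):

```python
class Quitter(object):
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return 'Use %s() or %s(code) to exit' % (self.name, self.name)
    def __call__(self, code=None):
        # exit(1) exits with status 1; exit() uses the default status 0.
        raise SystemExit(code)

exit = Quitter('exit')

# SystemExit carries the code up to the interpreter's top level:
try:
    exit(1)
except SystemExit as e:
    status = e.code
```

The repr still explains itself when the object is merely evaluated at the prompt, which was the whole point of the proposal.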
Re: [Python-Dev] quit() on the prompt
Björn Lindqvist wrote:

    do {
        cmd = readline();
        do_stuff_with_cmd(cmd);
    } while (strcmp(cmd, "quit"));
    printf("Bye!");
    exit(0);

KISS? I believe there were concerns that rebinding quit would cause strange behavior. E.g.:

    quit = False
    while not quit:
        ...
    quit

Or:

    if raw_input('quit?') == 'yes':
        ...
        quit

will that work? Should it? Functions are pretty predictable in comparison to these other options. So, at least to me, quit() == KISS -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] collections.idset and collections.iddict?
Guido van Rossum wrote:

On 3/6/06, Raymond Hettinger [EMAIL PROTECTED] wrote:

[Neil Schemenauer] I occasionally need dictionaries or sets that use object identity rather than __hash__ to store items. Would it be appropriate to add these to the collections module?

Why not decorate the objects with a class adding a method:

    def __hash__(self):
        return id(self)

That would seem to be more Pythonic than creating custom variants of other containers.

I hate to second-guess the OP, but you'd have to override __eq__ too, and probably __ne__ and __cmp__ just to be sure. And probably that wouldn't do -- since the default __hash__ and __eq__ have the desired behavior, the OP is apparently talking about objects that override these operations to do something meaningful; overriding them back presumably breaks other functionality. I wonder if this use case and the frequently requested case-insensitive dict don't have some kind of generalization in common -- perhaps a dict that takes a key function a la list.sort()?

That's what occurred to me as soon as I read Neil's post as well. I think it would have the added benefit that it would be case insensitive while still preserving case. Here's a rough idea of the semantics:

    from UserDict import DictMixin

    class KeyedDict(DictMixin):
        def __init__(self, keyfunc):
            self.keyfunc = keyfunc
            self.data = {}
        def __getitem__(self, key):
            return self.data[self.keyfunc(key)][1]
        def __setitem__(self, key, value):
            self.data[self.keyfunc(key)] = (key, value)
        def __delitem__(self, key):
            del self.data[self.keyfunc(key)]
        def keys(self):
            return [v[0] for v in self.data.values()]

I definitely like this more than a key-normalizing dictionary -- the normalized key is never actually exposed anywhere. I didn't follow the defaultdict thing through to the end, so I didn't catch what the constructor was going to look like for that; but I assume those choices will apply here as well.
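[For readers on newer Pythons, where UserDict.DictMixin no longer exists, here is a hypothetical self-contained sketch of the same idea using collections.abc.MutableMapping; the class and behavior mirror the KeyedDict sketch above but the mixin base is my substitution.]

```python
from collections.abc import MutableMapping

class KeyedDict(MutableMapping):
    """Dict that normalizes keys through keyfunc for lookup, but
    preserves the originally inserted key (case-insensitive yet
    case-preserving when keyfunc is str.lower)."""
    def __init__(self, keyfunc):
        self.keyfunc = keyfunc
        self.data = {}  # normalized key -> (original key, value)
    def __getitem__(self, key):
        return self.data[self.keyfunc(key)][1]
    def __setitem__(self, key, value):
        self.data[self.keyfunc(key)] = (key, value)
    def __delitem__(self, key):
        del self.data[self.keyfunc(key)]
    def __iter__(self):
        # expose the original, non-normalized keys
        return (k for k, v in self.data.values())
    def __len__(self):
        return len(self.data)

d = KeyedDict(str.lower)
d['Content-Type'] = 'text/html'
```

Lookup is then insensitive to case (`d['content-type']`) while `list(d)` still shows `'Content-Type'`; the normalized key never escapes.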
-- Ian Bicking | [EMAIL PROTECTED] | http://blog.ianbicking.org
Re: [Python-Dev] operator.is*Type
Raymond Hettinger wrote:

    >>> from operator import isSequenceType, isMappingType
    >>> class anything(object):
    ...     def __getitem__(self, index):
    ...         pass
    ...
    >>> something = anything()
    >>> isMappingType(something)
    True
    >>> isSequenceType(something)
    True

I suggest we either deprecate these functions as worthless, *or* we define the protocols slightly more clearly for user-defined classes.

They are not worthless. They do a damned good job of differentiating anything that CAN be differentiated.

But they are just identical...? They seem terribly pointless to me. Deprecation is one option, of course. I think Michael's suggestion also makes sense. *If* we distinguish between sequences and mapping types with two functions, *then* those two functions should be distinct. It seems kind of obvious, doesn't it? I think hasattr(obj, 'keys') is the simplest distinction of the two kinds of collections.

Your example simply highlights the consequences of one of Python's most basic, original design choices (using getitem for both sequences and mappings). That choice is now so fundamental to the language that it cannot possibly change. Get used to it. In your example, the results are correct. The anything class can be viewed as either a sequence or a mapping. In this and other posts, you seem to be focusing your design around notions of strong typing and mandatory interfaces. I would suggest that that approach is futile unless you control all of the code being run.

I think you are reading too much into it. If the functions exist, they should be useful. That's all I see in Michael's suggestion. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes
Mark Russell wrote:

On 21 Feb 2006, at 19:25, Jeremy Hylton wrote: If I recall the discussion correctly, Guido said he was open to a version of nested scopes that allowed rebinding.

PEP 227 mentions using := as a rebinding operator, but rejects the idea as it would encourage the use of closures. But to me it seems more elegant than some special keyword, especially if it could also replace the global keyword. It doesn't handle things like x += y, but I think you could deal with that by just writing x := x + y.

By "rebinding operator", does that mean it is actually an operator? I.e.:

    # Required assignment to declare?:
    chunk = None
    while chunk := f.read(1000):
        ...

-- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] defaultdict proposal round three
Alex Martelli wrote:

I prefer this approach over subclassing. The mental load from an additional method is less than the load from a separate type (even a subclass). Also, avoidance of invariant issues is a big plus. Besides, if this allows setdefault() to be deprecated, it becomes an all-around win.

I'd love to remove setdefault in 3.0 -- but I don't think it can be done before that: default_factory won't cover the occasional use cases where setdefault is called with different defaults at different locations, and, rare as those cases may be, any 2.* release should not break existing code that uses that approach.

Would it be deprecated in 2.*, or would deprecation start in 3.0? Also, is default_factory=list threadsafe in the same way .setdefault is? That is, you can safely do this from multiple threads:

    d.setdefault(key, []).append(value)

I believe this is safe with very few caveats -- setdefault itself is atomic (or else I'm writing some bad code ;). My impression is that default_factory will not generally be threadsafe in the way setdefault is. For instance:

    def make_list():
        return []

    d = {}
    d.default_factory = make_list

    # from multiple threads:
    d.getdef(key).append(value)

This would not be correct (a value can be lost if two threads concurrently enter make_list for the same key). In the case of default_factory=list (using the list builtin), is the story different? Will this work on Jython, IronPython, or PyPy? Will this be a documented guarantee? Or alternately, are we just creating a new way to punish people who use threads? And if we push threadsafety up to user code, are we trading a very small speed issue (creating lists that are thrown away) for a much larger speed issue (acquiring a lock)?

I tried to make a test for this threadsafety, actually -- using a technique besides setdefault which I knew was bad (try:except KeyError:). And (except using time.sleep(), which is cheating), I wasn't actually able to trigger the bug. Which is frustrating, because I know the bug is there. So apparently threadsafety is hard to test in this case. (If anyone is interested in trying it, I can email what I have.) Note that multidict -- among other possible concrete collection patterns (like Bag, OrderedDict, or others) -- can be readily implemented with threading guarantees. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
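[One way to make the default insertion explicit and safe for an arbitrary factory, regardless of what the dict type itself guarantees, is to guard the check-and-insert with a lock. This is a sketch of that idea, not anything proposed in the thread; the class name is mine.]

```python
import threading

class LockedDefault:
    """Wraps a dict plus a factory; get() is atomic even when the
    factory is arbitrary user code, so no value can be lost to a
    concurrent check-then-insert race."""
    def __init__(self, factory):
        self.data = {}
        self.factory = factory
        self.lock = threading.Lock()

    def get(self, key):
        with self.lock:
            try:
                return self.data[key]
            except KeyError:
                # only one thread can create and store the default
                value = self.data[key] = self.factory()
                return value

d = LockedDefault(list)
d.get('a').append(1)
d.get('a').append(2)
```

The trade-off is exactly the one raised above: a lock acquisition on every lookup, in exchange for correctness with any factory.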
Re: [Python-Dev] defaultdict proposal round three
Steven Bethard wrote:

Alternative A: add a new method to the dict type with the semantics of __getattr__ from the last proposal, using default_factory if not None (except on_missing is inlined).

I'm not certain I understood this right, but (after s/__getattr__/__getitem__) this seems to suggest that for keeping a dict of counts the code wouldn't really improve much:

    dd = {}
    dd.default_factory = int
    for item in items:
        # I want to do ``dd[item] += 1`` but with a regular method instead
        # of __getitem__, this is not possible
        dd[item] = dd.somenewmethod(item) + 1

This would be better done with a bag (a set that can contain multiple instances of the same item):

    dd = collections.Bag()
    for item in items:
        dd.add(item)

Then to see how many there are of an item, perhaps something like:

    dd.count(item)

No collections.Bag exists, but of course one should. It has nice properties -- inclusion is done with __contains__ (with dicts it probably has to be done with get), you can't accidentally go below zero, the methods express intent, and presumably it will implement only a meaningful set of methods. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
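[A minimal sketch of what such a Bag might look like; this is my own illustration, not code from the thread. Much of this functionality later appeared in the stdlib as collections.Counter.]

```python
class Bag:
    """Unordered collection that counts duplicate items."""
    def __init__(self):
        self._counts = {}  # item -> count

    def add(self, item):
        self._counts[item] = self._counts.get(item, 0) + 1

    def remove(self, item):
        # counts can never go below zero
        n = self._counts.get(item, 0)
        if n <= 1:
            self._counts.pop(item, None)
        else:
            self._counts[item] = n - 1

    def count(self, item):
        return self._counts.get(item, 0)

    def __contains__(self, item):
        return item in self._counts

    def __len__(self):
        # total number of items, counting duplicates
        return sum(self._counts.values())

dd = Bag()
for item in ['spam', 'eggs', 'spam']:
    dd.add(item)
```

Inclusion is a plain `'spam' in dd`, and `dd.count('never-added')` is simply 0 rather than a KeyError.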
Re: [Python-Dev] defaultdict proposal round three
Guido van Rossum wrote: Why are you so keen on using a dictionary to share data between threads that may both modify it? IMO this is asking for trouble -- the advice about sharing data between threads is always to use the Queue module.

I use them often for shared caches. But yeah, it's harder than I thought at first -- I think the actual cases I'm using work, since they use simple keys (ints, strings), but yeah, thread guarantees are too difficult to handle in general. Damn threads. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] Proposal: defaultdict
Michael Urman wrote:

On 2/19/06, Josiah Carlson [EMAIL PROTECTED] wrote: My post probably hasn't convinced you, but much of the confusion, I believe, is based on Martin's original belief that 'k in dd' should always return true if there is a default. One can argue that way, but then you end up on the circular train of thought that gets you to "you can't do anything useful if that is the case: .popitem() doesn't work, len() is undefined, ..." Keep it simple, keep it sane.

A default factory implementation fundamentally modifies the behavior of the mapping. There is no single answer to the question "what is the right behavior for contains, len, popitem?" as that depends on what the code that consumes the mapping is written like, what it is attempting to do, and what you are attempting to override it to do. Or, simply, on why you are providing a default value. Resisting the temptation to guess the why, and just leaving the methods as is, seems the best choice; overriding __contains__ to return true is much easier than reversing that behavior would be.

I agree that there is simply no universally correct answer for the various uses of default_factory. I think ambiguity on points like this is a sign that something is overly general. In many of the concrete cases it is fairly clear how these methods should work. In the most obvious case (default_factory=list) what seems to me to be the correct implementation is one that no one is proposing: that x in d means d.get(x). But that uses the fact that the return value of default_factory() is a false value, which we cannot assume in general. And it affects .keys() -- which I would propose overriding for multidict (so it only returns keys with non-empty lists for values), but I don't see how it could be made correct for default_factory. I just don't see why we should cram all these potential features into dict by using a vague feature like default_factory.
Why can't we just add a half-dozen new types of collections (to the module of the same name)? Each one will get its own page of documentation, a name, a proper __repr__, and well defined meaning for all of these methods that it shares with dict only insofar as it makes sense to share. Note that even if we use defaultdict or autodict or something besides changing dict itself, we still won't get a good __contains__, a good repr, or any of the other features that specific collection implementations will give us. Isn't there anyone else who sees the various dict-like objects being passed around as recipes, and thinks that maybe that's a sign they should go in the stdlib? The best of those recipes aren't all-encompassing, they just do one kind of container well. -- Ian Bicking | [EMAIL PROTECTED] | http://blog.ianbicking.org
Re: [Python-Dev] Proposal: defaultdict
Raymond Hettinger wrote: Over lunch with Alex Martelli, he proposed that a subclass of dict with this behavior (but implemented in C) would be a good addition to the language. I would like to add something like this to the collections module, but a PEP is probably needed to deal with issues like:

* implications of a __getitem__ succeeding while get(value, x) returns x (possibly different from the overall default)
* implications of a __getitem__ succeeding while __contains__ would fail
* whether to add this to the collections module (I would say yes)
* whether to allow default functions as well as default values (so you could instantiate a new default list)
* comparing all the existing recipes and third-party modules that have already done this
* evaluating its fitness for common use cases (i.e. bags and dicts of lists)

It doesn't seem that useful for bags, assuming we're talking about an {object: count} implementation of bags; bags should really have a more set-like interface than a dict-like interface. A dict of lists typically means a multi-valued dict. In that case it seems like x[key_not_found] should return the empty list, as that means zero values; even though zero values also means that x.has_key(key_not_found) should return False as well. *But* getting x[key_not_found] does not (for a multi-valued dict) mean that suddenly has_key should return true. I find the side-effect nature of __getitem__ as proposed in default_dict rather confusing, and when reading code it will very much break my expectations. I assume that attribute access and [] access will not have side effects. Coming at it from that direction, I'm -1, though I'm +1 on dealing with the specific use case that started this (x.setdefault(key, []).append(value)). An implementation targeted specifically at multi-valued dictionaries seems like it would be better. Incidentally, on Web-SIG we've discussed wsgiref, and it includes a multi-valued, ordered, case-insensitive dictionary.
Such a dictionary(ish) object has clear applicability for HTTP headers, but certainly it is something I've used many times elsewhere. In a case-sensitive form it applies to URL variables. Really there are several combinations of features, each with different uses. So we have now...

    dicts: unordered, key:value (associative), single-value
    sets:  unordered, not key:value, single-value
    lists: ordered, not key:value, multi-value

We don't have...

    bags:               unordered, not key:value, multi-value
    multi-dict:         unordered, key:value, multi-value
    ordered-dict:       ordered, key:value, single-value
    ordered-multi-dict: ordered, key:value, multi-value

For all key:value collections, normalized keys can be useful. (Though notably the wsgiref Headers object does not have normalized keys, but instead does case-insensitive comparisons.) I don't know where dict-of-dict best fits in here. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] The decorator(s) module
Georg Brandl wrote: Hi, it has been proposed before, but there was no conclusive answer last time: is there any chance for 2.5 to include commonly used decorators in a module? One peculiar aspect is that decorators are a programming technique, not a particular kind of functionality. So the module seems kind of funny as a result. Of course not everything that jumps around should go in, only pretty basic stuff that can be widely used. Candidates are: - @decorator. This properly wraps up a decorator function to change the signature of the new function according to the decorated one's. Yes, I like this, and it is purely related to decorators not anything else. Without this, decorators really hurt introspectability. - @contextmanager, see PEP 343. This is abstract enough that it doesn't belong anywhere in particular. - @synchronized/@locked/whatever, for thread safety. Seems better in the threading module. Plus contexts and with make it much less important as a decorator. - @memoize Also abstract, so I suppose it would make sense. - Others from wiki:PythonDecoratorLibrary and Michele Simionato's decorator module at http://www.phyast.pitt.edu/~micheles/python/documentation.html. redirecting_stdout is better implemented using contexts/with. @threaded (which runs the decorated function in a thread) seems strange to me. @blocking seems like it is going into async directions that don't really fit in with decorators (as a general concept). I like @tracing, though it doesn't seem like it is really implemented there, it's just an example? Unfortunately, a @property decorator is impossible... It already works! But only if you want a read-only property. Which is actually about 50%+ of the properties I create. So the status quo is not really that bad. 
-- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
[Python-Dev] Counter proposal: multidict (was: Proposal: defaultdict)
I really don't like that defaultdict (or a dict extension) means that x[not_found] will have noticeable side effects. This all seems to be a roundabout way to address one important use case of a dictionary with multiple values for each key, and in the process it breaks an important quality of good Python code: that attribute and getitem access not have noticeable side effects. So, here's a proposed interface for a new multidict object, borrowing some methods from Set but mostly from dict. Some things that seemed particularly questionable to me are marked with ??.

    class multidict:

        def __init__([mapping], [**kwargs]):
            """Create a multidict:
            multidict() -> new empty multidict
            multidict(mapping) -> equivalent to:
                ob = multidict()
                ob.update(mapping)
            multidict(**kwargs) -> equivalent to:
                ob = multidict()
                ob.update(kwargs)
            """

        def __contains__(key):
            """True if ``self[key]`` is true"""

        def __getitem__(key):
            """Returns a list of items associated with the given key.
            If nothing, then the empty list.
            ??: Is the list mutable, and to what effect?"""

        def __delitem__(key):
            """Removes any instances of key from the dictionary.  Does
            not raise an error if there are no values associated.
            ??: Should this raise a KeyError sometimes?"""

        def __setitem__(key, value):
            """Same as:
                del self[key]
                self.add(key, value)
            """

        def get(key, default=[]):
            """Returns a list of items associated with the given key,
            or if that list would be empty it returns default"""

        def getfirst(key, default=None):
            """Equivalent to:
                if key in self:
                    return self[key][0]
                else:
                    return default
            """

        def add(key, value):
            """Adds the value with the given key, so that
            self[key][-1] == value"""

        def remove(key, value):
            """Remove (key, value) from the mapping (raising KeyError
            if not present)."""

        def discard(key, value):
            """Remove like self.remove(key, value), except do not
            raise KeyError if missing."""

        def pop(key):
            """Removes key and returns the value; returns [] and does
            nothing if the key is not found."""

        def keys():
            """Returns all the keys which have some associated value."""

        def items():
            """Returns [(key, value)] for every key/value pair.  Keys
            that have multiple values will be returned as multiple
            (key, value) tuples."""

        def __len__():
            """Equivalent to len(self.items())
            ??: Not len(self.keys())?"""

        def update(E, **kwargs):
            """If E has iteritems then::
                for k, v in E.iteritems():
                    self.add(k, v)
            elif E has keys::
                for k in E:
                    self.add(k, E[k])
            else::
                for k, v in E:
                    self.add(k, v)
            ??: Should **kwargs be allowed?  If so, should the values
            be sequences?"""

        # iteritems, iterkeys, iter, has_key, copy, popitem, values,
        # clear -- with obvious implementations
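[A rough runnable sketch of the interface proposed above. The method names follow the proposal; the internal representation and the resolution of the ??-marked questions are my own choices for illustration, not decisions from the thread.]

```python
class multidict:
    """Mapping from key to multiple values; lookups never mutate."""
    def __init__(self, mapping=None, **kwargs):
        self._data = {}  # key -> list of values
        if mapping is not None:
            self.update(mapping)
        if kwargs:
            self.update(kwargs)

    def add(self, key, value):
        self._data.setdefault(key, []).append(value)

    def __getitem__(self, key):
        # a copy, so mutating the result cannot corrupt the multidict
        return list(self._data.get(key, []))

    def __setitem__(self, key, value):
        self._data[key] = [value]

    def __delitem__(self, key):
        self._data.pop(key, None)  # no KeyError for absent keys

    def __contains__(self, key):
        return bool(self._data.get(key))

    def get(self, key, default=None):
        values = self._data.get(key)
        if values:
            return list(values)
        return [] if default is None else default

    def getfirst(self, key, default=None):
        values = self._data.get(key)
        return values[0] if values else default

    def pop(self, key):
        return self._data.pop(key, [])

    def keys(self):
        return [k for k, v in self._data.items() if v]

    def items(self):
        # one (key, value) pair per stored value
        return [(k, v) for k, vs in self._data.items() for v in vs]

    def __len__(self):
        return len(self.items())

    def update(self, E):
        if hasattr(E, 'items'):
            for k, v in E.items():
                self.add(k, v)
        else:
            for k, v in E:
                self.add(k, v)

md = multidict()
md.add('a', 1)
md.add('a', 2)
```

Note that `md['missing']` is just `[]`, with no side effect on `md.keys()` -- the point of the proposal.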
Re: [Python-Dev] Counter proposal: multidict
Guido van Rossum wrote:

On 2/17/06, Ian Bicking [EMAIL PROTECTED] wrote: I really don't like that defaultdict (or a dict extension) means that x[not_found] will have noticeable side effects. This all seems to be a roundabout way to address one important use case of a dictionary with multiple values for each key, and in the process breaking an important quality of good Python code, that attribute and getitem access not have noticeable side effects. So, here's a proposed interface for a new multidict object, borrowing some methods from Set but mostly from dict. Some things that seemed particularly questionable to me are marked with ??.

Have you seen my revised proposal (which is indeed an addition to the standard dict rather than a subclass)?

Yes, and though it is more general it has the same issue of side effects. Doesn't it seem strange that getting an item will change the values of .keys(), .items(), and .has_key()?

Your multidict addresses only one use case for the proposed behavior; what's so special about dicts of lists that they should have special support? What about dicts of dicts, dicts of sets, dicts of user-defined objects?

What's so special? 95% (probably more!) of current use of .setdefault() is .setdefault(key, []).append(value). Also, since when do features have to address all possible cases? Certainly there are other cases, and I think they can be answered with other classes. Here are some current options:

* .setdefault() -- works with any subtype; slightly less efficient than what you propose. Awkward to read; doesn't communicate intent very well.

* UserDict -- works for a few cases where you want to make dict-like objects. Messes up the concept of identity and containment -- the resulting objects both are dictionaries, and contain a dictionary (obj.data).

* DictMixin -- does anything you can possibly want, requiring only the overriding of a couple methods.

* dict subclassing -- does anything you want as well, but you typically have to override many more methods than with DictMixin (and if you don't have to override every method, that's not documented in any way). Isn't written with subclassing in mind.

Really, you are proposing that one specific kind of override be made feasible, either with subclassing or injecting a method. That said, I'm not saying that several kinds of behavior shouldn't be supported. I just don't see why dict should support them all (or multidict). And I also think dict will support them poorly. multidict implements one behavior *well*, in a documented way, with a name people can refer to. I can say "multidict"; I can't say "a dict where I set default_factory to list" (well, I can say that, but that just opens up yet more questions and clarifications). Some ways multidict differs from default_factory=list:

* __contains__ works (you have to use .get() with default_factory to get a meaningful result)

* Barring cases where there are exceptions, x[key] and x.get(key) return the same value for multidict; with default_factory one returns [] and the other returns None when the key isn't found. But if you do x[key]; x.get(key), then x.get(key) always returns [].

* You can't use __setitem__ to put non-list items into a multidict; with multidict you don't have to guard against non-sequence values.

* [] is meaningful not just as the default value, but as a null value; the multidict implementation respects both aspects.

* There's a specific method x.add(key, value) that indicates intent in a way that x[key].append(value) does not.

* items and iteritems return values meaningful to the context (a list of (key, single_value)) -- this is usually what I want, and it avoids a nested for loop. __len__ is also usefully different than in dict.

* .update() handles iteritems sensibly, and updates from dictionaries sensibly -- if you mix a default_factory=list dict with a normal (single-value) dictionary you'll get an effectively corrupted dictionary (where some keys are lists).

* x.getfirst(key) is useful.

* I think this will be much easier to reason about in situations with threads -- dict acts very predictably with threads, and people rely upon that.

* multidict can be written either with subclassing intended, or with an abstract superclass, so that other kinds of specializations of this superset of the dict interface can be made more easily (if DictMixin itself isn't already sufficient).

So, I'm saying: multidict handles one very common collection need that dict handles awkwardly now. multidict is a meaningful and useful class with its own identity/name and meaning separate from dict, and has methods that represent both the intersection and the difference between the two classes. multidict does not in any way preclude other collection objects for other situations; it is entirely unfair to expect a new class to solve all issues. multidict suggests an interface that other related classes can use (e.g., an ordered version).
Re: [Python-Dev] Proposal: defaultdict
Guido van Rossum wrote:

    d = {}
    d.default_factory = set
    ...
    d[key].add(value)

Another option would be:

    d = {}
    d.default_factory = set
    d.get_default(key).add(value)

Unlike .setdefault, this would use a factory associated with the dictionary, and no default value would get passed in. Unlike the proposal, this would not override __getitem__ (not overriding __getitem__ is really the only difference with the proposal). It would be clear reading the code that you were not implicitly asserting that ``key in d`` was true. get_default isn't the best name, but another name isn't jumping out at me at the moment. Of course, it is not a Pythonic argument to say that an existing method should be overridden, or functionality made nameless, simply because we can't think of a name (looking to anonymous functions of course ;) -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
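[A sketch of how such a method could be layered onto a dict subclass. `get_default` is the tentative name from this message; `FactoryDict` and the implementation details are my own illustration.]

```python
class FactoryDict(dict):
    """dict with an explicit get_default() method instead of a
    side-effecting __getitem__."""
    def __init__(self, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)
        self.default_factory = None

    def get_default(self, key):
        try:
            return self[key]
        except KeyError:
            if self.default_factory is None:
                raise  # no factory: behave like plain __getitem__
            value = self[key] = self.default_factory()
            return value

d = FactoryDict()
d.default_factory = set
d.get_default('k').add('value')
```

Plain `d['missing']` still raises KeyError, so reading code can tell access from modification at a glance.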
Re: [Python-Dev] Proposal: defaultdict
Guido van Rossum wrote:

On 2/17/06, Adam Olsen [EMAIL PROTECTED] wrote: It also makes it harder to read code. You may expect d[key] to raise an exception, but it won't because of a single line up several pages (or in another file entirely!)

Such are the joys of writing polymorphic code. I don't really see how you can avoid this kind of confusion -- I could have given you some other mapping object that does weird stuff.

The way you avoid confusion is by not working with code or programmers who write bad code. Python and polymorphic code in general push the responsibility for many errors from the language structure onto the programmer -- it is the programmer's responsibility to write good code. Python has never kept people from writing obscenely horrible code. We ought to have an obfuscated Python contest just to prove that point -- it is through practice and convention that readable Python code happens, not through the restrictions of the language. (Honestly, I think such a contest would be a good idea.) I know *I* at least don't like code that mixes up access and modification. Maybe not everyone does (or maybe not everyone thinks of getitem as access, but that's unlikely). I will assert that it is Pythonic to keep access and modification separate, which is why methods and attributes are different things, and why assignment is not an expression, and why functions with side effects typically return None, or have names that are very explicit about the side effect, with names containing command verbs like "update" or "set". All of these distinguish access from modification. Note that all of what I'm saying *only* applies to the overriding of __getitem__, not the addition of any new method. I think multidict is better for the places it applies, but I see no problem at all with a new method on dictionaries that calls on_missing.
-- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]
Martin v. Löwis wrote: Users do

    >>> "Martin v. Löwis".encode("utf-8")
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11: ordinal not in range(128)

because they want to convert the string to Unicode, and they have found a text telling them that .encode("utf-8") is a reasonable method. What it *should* tell them is

    >>> "Martin v. Löwis".encode("utf-8")
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    AttributeError: 'str' object has no attribute 'encode'

I think it would be even better if they got "ValueError: utf8 can only encode unicode objects". AttributeError is not much more clear than the UnicodeDecodeError. That str.encode(unicode_encoding) implicitly decodes strings seems like a flaw in the unicode encodings, quite separate from the existence of str.encode. I for one really like s.encode('zlib').encode('base64') -- and if the zlib encoding raised an error when it was passed a unicode object (instead of implicitly encoding the string with the ascii encoding) that would be fine. The pipe-like nature of .encode and .decode works very nicely for certain transformations, applicable to both unicode and byte objects. Let's not throw the baby out with the bath water. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
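[In Python 3 the bytes-to-bytes transforms were in fact removed from str.encode; the same pipeline can be spelled with explicit module calls. A sketch of the equivalent, with sample data of my own choosing:]

```python
import base64
import zlib

data = b'Martin v. Loewis, ' * 8

# Python 2's s.encode('zlib').encode('base64'), made explicit:
wire = base64.b64encode(zlib.compress(data))

# and the reverse pipeline:
back = zlib.decompress(base64.b64decode(wire))
```

The explicit spelling loses the pipe-like chaining praised above, but it can never silently mix up text and bytes.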
Re: [Python-Dev] Proposal: defaultdict
Martin v. Löwis wrote:

I know *I* at least don't like code that mixes up access and modification. Maybe not everyone does (or maybe not everyone thinks of getitem as access, but that's unlikely). I will assert that it is Pythonic to keep access and modification separate, which is why methods and attributes are different things, and why assignment is not an expression, and why functions with side effects typically return None, or have names that are very explicit about the side effect, with names containing command verbs like "update" or "set". All of these distinguish access from modification.

Do you never write

    d[some_key].append(some_value)

This is modification and access, all in a single statement, and all without an assignment operator.

(d[some_key]) is access. (...).append(some_value) is modification. Expressions are compound; of course you can mix both access and modification in a single expression. d[some_key] is access that returns something, and .append(some_value) modifies that something; it doesn't modify d.

I don't see the setting of the default value as a modification. The default value has been there, all the time. It only is incarnated lazily.

It is lazily incarnated for multidict, because there is no *noticeable* side effect -- if there is any internal side effect, that is an implementation detail. However, for default_factory=list, the result of .keys(), .has_key(), and .items() changes when you do d[some_key]. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] Proposal: defaultdict
Adam Olsen wrote: The latter is even the preferred form, since it only invokes a single dict lookup: On 2/16/06, Delaney, Timothy (Tim) [EMAIL PROTECTED] wrote:

try:
    v = d[key]
except KeyError:
    v = d[key] = value

Obviously this example could be changed to use default_factory, but I find it hard to believe the only use of that pattern is to set default keys. I'd go further -- I doubt many cases where try:except KeyError: is used could be refactored to use default_factory -- default_factory can only be used when the default can be determined close to the time the dictionary is created, when the default is not dependent on the context in which the key is fetched, and when the default value will not cause unintended side effects if the dictionary leaks out of the code where it was initially used (like if the dictionary is returned to someone). Any default factory is more often an algorithmic detail than truly part of the nature of the dictionary itself. For instance, here is something I do often:

try:
    value = cache[key]
except KeyError:
    ... calculate value ...
    cache[key] = value

Realistically, factoring ... calculate value ... into a factory that calculates the value would be difficult, produce highly unreadable code, perform worse, and have more bugs. For simple factories like list and dict the factory works okay. For immutable values like 0 and None, the factory (lambda: 0 and lambda: None) is a wasteful way to create a default value (because storing the value in the dictionary is unnecessary). For non-trivial factories the whole thing falls apart, and one can just hope that no one will try to use this feature and will instead stick with the try:except KeyError: technique. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
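The caching pattern from the message, written out runnably; calculate() here is a hypothetical stand-in for the elided "... calculate value ...":

```python
cache = {}

def calculate(key):
    # hypothetical expensive computation, dependent on call-site context
    return key * 2

def lookup(key):
    try:
        value = cache[key]
    except KeyError:
        value = cache[key] = calculate(key)
    return value

assert lookup(3) == 6
assert cache == {3: 6}
assert lookup(3) == 6      # second call hits the cache
```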
Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]
Josiah Carlson wrote: If some users can't understand this (passing different arguments to a function may produce different output), It's worse than that. The return *type* depends on the *value* of the argument. I think there is little precedent for that: normally, the return values depend on the argument values, and, in a polymorphic function, the return type might depend on the argument types (e.g. the arithmetic operations). Also, the return type may depend on the number of arguments (e.g. by requesting a return type in a keyword argument). You only need to look to dictionaries, where different values passed into a function call may very well return results of different types, yet there have been no restrictions on mapping to and from single types per dictionary. Many dict-like interfaces for configuration files do this, things like config.get('remote_host') and config.get('autoconnect') not being uncommon. I think there is *some* justification, if you don't understand up front that the codec you refer to (using a string) is just a way of avoiding an import (thankfully -- dynamically importing unicode codecs is obviously infeasible). Now, if you understand the argument refers to some algorithm, it's not so bad. The other aspect is that there should be something consistent about the return types -- the Python type is not what we generally rely on, though. In this case they are all data. Unicode and bytes are both data, and you could probably argue a list of ints is data too (but an arbitrary list definitely isn't data). On the outer end of data might be an ElementTree structure (but that's getting fishy). An open file object is not data. A tuple probably isn't data. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
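The complaint that the return *type* depends on the argument's *value* is still visible in Python 3's codecs module, where the codec name decides whether you get bytes or str back -- a hedged modern illustration, not part of the original thread:

```python
import codecs

# same function, same argument types -- the string *value* picks the return type
assert codecs.decode(b"68656c6c6f", "hex_codec") == b"hello"   # bytes in, bytes out
assert codecs.decode(b"hello", "utf-8") == "hello"             # bytes in, str out
```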
Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]
Martin v. Löwis wrote: Ian Bicking wrote: That str.encode(unicode_encoding) implicitly decodes strings seems like a flaw in the unicode encodings, quite separate from the existence of str.encode. I for one really like s.encode('zlib').encode('base64') -- and if the zlib encoding raised an error when it was passed a unicode object (instead of implicitly encoding the string with the ascii encoding) that would be fine. The pipe-like nature of .encode and .decode works very nicely for certain transformations, applicable to both unicode and byte objects. Let's not throw the baby out with the bath water. The way you use it, it's a matter of notation only: why is zlib(base64(s)) any worse? I think it's better: it doesn't use string literals to denote function names. Maybe it isn't worse, but the real alternative is:

import zlib
import base64

base64.b64encode(zlib.compress(s))

Encodings cover up eclectic interfaces, where those interfaces fit a basic pattern -- data in, data out. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] The decorator(s) module
Alex Martelli wrote: Maybe we could fix that by having property(getfunc) use getfunc.__doc__ as the __doc__ of the resulting property object (easily overridable in more normal property usage by the doc= argument, which, I feel, should almost invariably be there). +1 -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
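What Alex suggests here was in fact later adopted: property() falls back to the getter's docstring when no doc= argument is given. A quick check:

```python
class Point:
    def _get_x(self):
        "The x coordinate."
        return self._x

    x = property(_get_x)               # no doc= argument supplied
    y = property(_get_x, doc="Alias")  # explicit doc= still overrides

assert Point.x.__doc__ == "The x coordinate."
assert Point.y.__doc__ == "Alias"
```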
Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]
Martin v. Löwis wrote: Maybe it isn't worse, but the real alternative is: import zlib import base64 base64.b64encode(zlib.compress(s)) Encodings cover up eclectic interfaces, where those interfaces fit a basic pattern -- data in, data out. So should I write 3.1415.encode("sin") or would that be 3.1415.decode("sin") The ambiguity shows that sin is clearly not an encoding. Doesn't read right anyway. [0.3, 0.35, ...].encode('fourier') would be sensible though. Except of course lists don't have an encode method; but that's just a convenience of strings and unicode because those objects are always data, where lists are only sometimes data. If extended indefinitely, the namespace issue is notable. But it's not going to be extended indefinitely, so that's just a theoretical problem. What about "http://www.python.org".decode("URL") you mean 'a%20b'.decode('url') == 'a b'? That's not what you meant, but nevertheless that would be an excellent encoding ;) -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
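The hypothetical 'url' codec never happened, but the transformation itself lives in urllib.parse:

```python
from urllib.parse import quote, unquote

assert unquote("a%20b") == "a b"   # the 'a%20b'.decode('url') of the joke
assert quote("a b") == "a%20b"     # and the encoding direction
```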
Re: [Python-Dev] Extension to ConfigParser
Sorry, I didn't follow up here like I should have, and I haven't followed the rest of this conversation, so apologies if I am being redundant... Fuzzyman wrote: While ConfigParser is okay for simple configuration, it is (IMHO) not a very good basis for anyone who wants to build better systems, like config files that can be changed programmatically, or error messages that point to file and line numbers. Those aren't necessarily features we need to expose in the standard library, but it'd be nice if you could implement that kind of feature without having to ignore the standard library entirely. Can you elaborate on what kinds of programmatic changes you envisage? I'm just wondering if there are classes of usage not covered by ConfigObj. Of course you can pretty much do anything to a ConfigObj instance programmatically, but even so... ConfigObj does fine, my criticism was simply of ConfigParser in this case. Just yesterday I was doing (with ConfigParser):

conf.save('app:main', '## Uncomment this next line to enable authentication:\n#filter-with', 'openid')

This is clearly lame ;) That said, I'm not particularly enthused about a highly featureful config file *format* in the standard library, even if I would like a much more robust implementation. I don't see how you can easily separate the format from the parser - unless you just leave raw values. (As I said in the other email, I don't think I fully understand you.) If accessing raw values suits your purposes, why not subclass ConfigParser and do magic in the get* methods? I guess I haven't really looked closely at the implementation of ConfigParser, so I don't know how serious the subclassing would have to be. But, for example, if you wanted to do nested sections this is not infeasible with the current syntax; you just have to overload the meaning of the section names. E.g., [foo.bar] (a section named foo.bar) could mean that this is a subsection of foo. 
Or, if the parser allows you to see the order of sections, you could use [[bar]] (a section named [bar]) to imply a subsection, not unlike what you have already, except without the indentation. I think there's lots of other kinds of things you can do with the INI syntax as-is, but providing a different interface to it. If you allow an easy-to-reuse parser, you can even check that syntax at read time. (Or if you keep enough information, check the syntax later and still be able to signal errors with filenames and line numbers.) An example of a parser that doesn't imply much of anything about the object being produced is one that I wrote here: http://svn.colorstudy.com/INITools/trunk/initools/iniparser.py On top of that I was able to build some other fancy things without much problem (which ended up being too fancy, but that's a different issue ;) From my light reading on ConfigObj, it looks like it satisfies my personal goals (though I haven't used it), but maybe has too many features, like nested sections. And it seems like maybe the API can be [...] I personally think nested sections are very useful and would be sad to not see them included. Grouping additional configuration options as a sub-section can be *very* handy. Using .'s in names can also do grouping, or section naming conventions. -- Ian Bicking | [EMAIL PROTECTED] | http://blog.ianbicking.org
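The [foo.bar]-style nesting sketched above can be layered on the stock parser without changing the syntax. A minimal illustration; the nested() helper is hypothetical:

```python
import configparser

TEXT = """
[foo]
a = 1
[foo.bar]
b = 2
"""

def nested(text):
    """Read INI text, treating dotted section names as nested sections."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    tree = {}
    for section in parser.sections():
        node = tree
        for part in section.split("."):
            node = node.setdefault(part, {})
        node.update(parser[section])
    return tree

tree = nested(TEXT)
assert tree["foo"]["a"] == "1"
assert tree["foo"]["bar"]["b"] == "2"
```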
Re: [Python-Dev] Extension to ConfigParser
Fuzzyman wrote: The resolution I'm suggesting means that people can continue to use ConfigParser, with major feature enhancements. *Or* they can migrate to a slightly different API that is easier to use - without needing to switch between incompatible modules. I don't think enhancing ConfigParser significantly is a good way forward. Because of ConfigParser's problems people have made all sorts of workarounds, and so I don't think there's any public interface that we can maintain while changing the internals without breaking lots of code. In practice, everything is a public interface. So I think the implementation as it stands should stay in place, and if anything it should be deprecated instead of being enhanced in-place. Another class or module could be added that fulfills the documented interface to ConfigParser. This would provide an easy upgrade path, without calling it a backward-compatible interface. I personally would like it if any new config system included a parser, and then an interface to the configuration that was read (ConfigParser is only the latter). Then people who want to do their own thing can work with just the parser, without crudely extending and working around the configuration interface. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] Extension to ConfigParser
Guido van Rossum wrote: I don't think enhancing ConfigParser significantly is a good way forward. Because of ConfigParser's problems people have made all sorts of workarounds, and so I don't think there's any public interface that we can maintain while changing the internals without breaking lots of code. In practice, everything is a public interface. So I think the implementation as it stands should stay in place, and if anything it should be deprecated instead of being enhanced in-place. Somehow that's not my experience. What's so bad about ConfigParser? What would break if we rewrote the save functionality to produce a predictable order? That's a fairly minor improvement, and I can't see how that would break anything. But Michael (aka Fuzzyman -- sorry, I just can't refer to you as Fuzzyman without feeling absurd ;) was proposing ConfigObj specifically (http://www.voidspace.org.uk/python/configobj.html). I assume the internals of ConfigObj bear no particular resemblance to ConfigParser, even if ConfigObj can parse the same syntax (plus some, and with different failure cases) and provide the same public API. While ConfigParser is okay for simple configuration, it is (IMHO) not a very good basis for anyone who wants to build better systems, like config files that can be changed programmatically, or error messages that point to file and line numbers. Those aren't necessarily features we need to expose in the standard library, but it'd be nice if you could implement that kind of feature without having to ignore the standard library entirely. That said, I'm not particularly enthused about a highly featureful config file *format* in the standard library, even if I would like a much more robust implementation. From my light reading on ConfigObj, it looks like it satisfies my personal goals (though I haven't used it), but maybe has too many features, like nested sections. 
And it seems like maybe the API can be reduced in size with a little high-level refactoring -- APIs generally grow over time so as to preserve backward compatibility, but I think if it was introduced into the standard library that might be an opportunity to trim the API back again before it enters the long-term API freeze that the standard library demands. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] Path inherits from string
Fredrik Lundh wrote: However, I might be wrong because according to [1] it should work. And having to wrap the Path object in str() (open(str(somepath))) each and every time the called function expects a string is not a practical solution. in Python, the usual way to access an attribute of an object is to access the attribute; e.g. f = open(p.name) You mean f = open(Path(p).name), because it is likely that people will also have to accept strings for the near term (and probably long term) future. And the error message without it will be inscrutable (and will still be inscrutable in many cases when you try to access other methods, sadly). And currently .name is taken for something else in the API. And the string path is not really an attribute because the string path *is* the object, it is not *part* of the object. OTOH, str(path) will break unicode filenames. And unicode() breaks anything that simply desires to pass data through without affecting its encoding. An open method on paths simplifies many of these issues, but doesn't do anything for passing a path to legacy code. Changing open() and all the functions that Path replaces (e.g., os.path.join) to accept Path objects may resolve issues with a substantial portion of code. But any code that does a typecheck on arguments will be broken -- which in the case of paths is quite common since many functions take both filename and file object arguments. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
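The problem discussed here -- open(str(somepath)) everywhere, and typechecks breaking on path objects -- is essentially what the much later os.PathLike protocol (PEP 519) addressed: an object declares its path via __fspath__ and open() consumes it directly. A sketch under that assumption; MyPath is hypothetical:

```python
import os
import tempfile

class MyPath:
    """Hypothetical path object: the string path *is* the object's value."""
    def __init__(self, raw):
        self.raw = raw
    def __fspath__(self):
        # os.PathLike hook: open(), os.remove(), etc. call this automatically
        return self.raw

fd, name = tempfile.mkstemp()
os.close(fd)
with open(MyPath(name), "w") as f:   # no str() wrapping required
    f.write("ok")
with open(MyPath(name)) as f:
    assert f.read() == "ok"
os.remove(MyPath(name))              # os functions accept it too
assert not os.path.exists(name)
```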
Re: [Python-Dev] The path module PEP
BJörn Lindqvist wrote: * match() and matchcase() wrap the fnmatch.fnmatch() and fnmatch.fnmatchcase() functions. I believe that the renaming is uncontroversial and that the introduction of matchcase() makes it so the whole fnmatch module can be deprecated. The renaming is fine with me. I generally use the fnmatch module for wildcard matching, not necessarily against path names. Path.match doesn't replace that functionality. Though fnmatch.translate isn't even publicly documented, which is the function I actually tend to use. Though it seems a little confusing to me that glob treats separators specially, and that's not implemented at the fnmatch level. So Path('/a/b/d/c').match('a/*/d') is true, but Path('/').walk('a/*/d') won't return Path('/a/b/c/d'). I think .match() should be fixed. But I don't think fnmatch should be changed. I'm actually finding myself a little confused by the glob arguments (if the glob contains '/'), now that I really think about them. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
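The separator point is easy to verify: fnmatch.translate() (the function the message says is actually useful) produces a regex whose * crosses '/' freely, unlike glob:

```python
import fnmatch
import re

pattern = re.compile(fnmatch.translate("a/*/d"))
assert pattern.match("a/b/d")
assert pattern.match("a/b/c/d")   # '*' happily spans '/' -- fnmatch has no
                                  # notion of path separators; glob adds that
```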
Re: [Python-Dev] The path module PEP
John J Lee wrote: On Tue, 24 Jan 2006, Ian Bicking wrote: [...] Losing .open() would make it much harder for anyone wanting to write, say, a URI library that implements the Path API. [...] Why? Could you expand a bit? What's wrong with urlopen(filesystem_path_instance)? My example shows this more clearly I think:

def read_config(path):
    text = path.open().read()
    ... do something ...

If I implement a URI object with an .open() method, then I can use it with this function, even though read_config() was written with file paths in mind. But without it that won't work:

def read_config(path):
    text = open(path).read()
    ...

-- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
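The point restated as runnable code -- a hypothetical URI object that satisfies read_config() purely by providing .open() (io.StringIO stands in for a real urlopen()):

```python
import io

class FakeURI:
    """Hypothetical URI object implementing the Path API's open()."""
    def __init__(self, uri, content):
        self.uri = uri
        self.content = content
    def open(self):
        # a real implementation would call urllib.request.urlopen(self.uri)
        return io.StringIO(self.content)

def read_config(path):
    return path.open().read()

config = read_config(FakeURI("http://example.com/app.conf", "debug = true"))
assert config == "debug = true"
```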
Re: [Python-Dev] The path module PEP
Tony Meyer wrote: Remove __div__ (Ian, Jason, Michael, Oleg) This is one of those where everyone (me too) says I don't care either way. If that is so, then I see no reason to change it unless someone can show a scenario in which it hurts readability. Plus, a few people have said that they like the shortcut.

* http://mail.python.org/pipermail/python-list/2005-July/292251.html
* http://mail.python.org/pipermail/python-dev/2005-June/054496.html
* http://mail.python.org/pipermail/python-list/2005-July/291628.html
* http://mail.python.org/pipermail/python-list/2005-July/291621.html

Well, if you include the much larger discussion on python-list, people (including me) have said that removing __div__ is a good idea. If it's included in the PEP, please at least include a justification and cover the problems with it. The vast majority of people (at least at the time) were either +0 or -0, not +1. +0's are not justification for including something. If it were possible to use .join() for joining paths, I think I wouldn't mind so much. But reusing a string method for something very different seems like a bad idea. So we're left with .joinpath(). Still better than os.path.join() I guess, but only a little. I guess that's why I'm +1 on /. Against it:

* Zen: Beautiful is better than ugly. Explicit is better than implicit. Readability counts. There should be one-- and preferably only one --obvious way to do it.

I think / is pretty. I think it reads well. There's already some inevitable redundancy in this interface. I use os.path.join so much that I know anything I use will feel readable quickly, but I also think I'll find / more appealing.

* Not every platform that Python supports has '/' as the path separator. Windows, a pretty major one, has '\'. I have no idea what various portable devices use, but there's a reasonable chance it's not '/'.

I believe all platforms support /; at least Windows and Mac do, in addition to their native separators. 
I assume any platform that supports filesystem access will support / in Python. If anything, a good shortcut for .joinpath() will at least encourage people to use it, thus discouraging hardcoding of path separators. I expect it would encourage portable paths. Though Path('/foo') / '/bar' == Path('/bar'), which is *not* intuitive, though in the context of join it's not as surprising. So that is a problem. If / meant "under this path" then that could be a useful operator (in that I'd really like such an operator or method). Either paths would be forced to be under the original path, or it would be an error if they somehow escaped. Currently there's no quick-and-easy way to ensure this, except to join the paths, do abspath(), then confirm that the new path starts with the old path.

* It's being used to mean join, which is the exact opposite of /'s other meaning (divide).
* Python's not Perl. We like using functions and not symbols.

A little too heavy on the truisms. Python isn't the anti-Perl. Renaming methods because of PEP 8 (Gustavo, Ian, Jason) I'm personally not keen on that. I like most of the names as they are. abspath(), joinpath(), realpath() and splitall() look so much better than abs_path(), join_path(), real_path() and split_all() in my eyes. If someone likes the underscores I'll add it to Open Issues. +1 to following PEP 8. These aren't built-ins, it's a library module. In addition to the PEP, underscores make it much easier to read, especially for those for whom English is not their first language. I don't find abs_path() much easier to read than abspath() -- neither is a full name. absolute_path() perhaps, but that is somewhat redundant; absolute()...? Eh. Precedence in naming means something, and in this case all the names have existed for a very long time (as long as Python?). PEP 8 encourages following naming precedence. 
While I don't see a need to match every existing function with a method, to the degree they do match I see no reason why we shouldn't keep the names. And I see reasons why the names shouldn't be changed. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
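The surprising Path('/foo') / '/bar' == Path('/bar') behavior mentioned in this thread follows os.path.join's rule that an absolute right-hand side discards the left; the pathlib module Python eventually adopted kept the same rule:

```python
import posixpath
from pathlib import PurePosixPath

assert posixpath.join("/foo", "/bar") == "/bar"        # absolute 'b' wins
assert str(PurePosixPath("/foo") / "/bar") == "/bar"   # pathlib kept the rule
assert posixpath.join("/foo", "bar") == "/foo/bar"     # relative 'b' appends
```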
Re: [Python-Dev] The path module PEP
BJörn Lindqvist wrote: Remove __div__ (Ian, Jason, Michael, Oleg) This is one of those where everyone (me too) says I don't care either way. If that is so, then I see no reason to change it unless someone can show a scenario in which it hurts readability. Plus, a few people have said that they like the shortcut.

* http://mail.python.org/pipermail/python-list/2005-July/292251.html
* http://mail.python.org/pipermail/python-dev/2005-June/054496.html
* http://mail.python.org/pipermail/python-list/2005-July/291628.html
* http://mail.python.org/pipermail/python-list/2005-July/291621.html

Curious about how often I use os.path.join and division, I searched a project of mine, and in 12k lines there were 34 uses of join, and 1 use of division. In smaller scripts os.path.join tends to show up a lot more (per line). I'm sure there are people who use division far more than I, and os.path.join less, but I'm guessing the majority of users are more like me. That's not necessarily a justification of / for paths, but at least this use for / wouldn't be obscure or mysterious after you get a little experience seeing code that uses it. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
Re: [Python-Dev] The path module PEP
Tony Meyer wrote: [Ian Bicking] If it were possible to use .join() for joining paths, I think I wouldn't mind so much. But reusing a string method for something very different seems like a bad idea. So we're left with .joinpath(). Still better than os.path.join() I guess, but only a little. I guess that's why I'm +1 on /. Why does reusing a string method for something very different seem like a bad idea, but reusing a mathematical operator for something very different seem like a good idea? Paths aren't strings, so join() seems the logical choice. (There are also alternatives to joinpath if the name is the thing: add(), for example). Paths are strings, that's in the PEP. As an aside, I think it should be specified what (if any) string methods won't be inherited by Path (or will be specifically disabled by making them throw some exception). I think .join() and __iter__ at least should be disabled. Precedence in naming means something, and in this case all the names have existed for a very long time (as long as Python?) PEP 8 encourages following naming precedence. While I don't see a need to match every existing function with a method, to the degree they do match I see no reason why we shouldn't keep the names. And I see reasons why the names shouldn't be changed. PEP 8 encourages following naming precedence within a module, doesn't it? Guido has said that he'd like to have the standard library tidied up, at least somewhat (e.g. StringIO.StringIO -> stringio.StringIO) for Python 3000. It would make it less painful if new additions already followed the plan. I think the use of underscores or squished words isn't as big a deal as the case of modules. It's often rather ambiguous what a word really is. At least in English word combinations slowly and ambiguously float towards being combined. So abspath and abs_path both feel sufficiently inside the scope of PEP 8 that precedence is worth maintaining. 
rfc822's getallmatchingheaders method was going too far, but a little squishing doesn't bother me, if it is consistent (and it's actually easier to be consistent about squishing than underscores). -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
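Disabling inherited string methods on a str subclass, as suggested for .join() and __iter__ in the exchange above, can be sketched like this (the Path class here is a toy, not the PEP's):

```python
class Path(str):
    """Toy str subclass that disables misleading inherited methods."""
    def join(self, iterable):
        raise TypeError("str.join() is disabled on Path; it is not os.path.join()")
    def __iter__(self):
        raise TypeError("iterating a Path character-by-character is disabled")

p = Path("/usr/lib")
assert p.startswith("/usr")        # ordinary string behavior still works
errors = 0
try:
    p.join(["a", "b"])
except TypeError:
    errors += 1
try:
    list(p)                        # list() uses __iter__, so it is blocked too
except TypeError:
    errors += 1
assert errors == 2
```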
Re: [Python-Dev] / as path join operator
Steven Bethard wrote: My only fear with the / operator is that we'll end up with the same problems we have for using % in string formatting -- the order of operations might not be what users expect. Since join is conceptually an addition-like operator, I would expect:

Path('home') / 'a' * 5

to give me: home/a If I understand it right, it would actually give me something like: home/ahome/ahome/ahome/ahome/a Both of these examples are rather silly, of course ;) There are two operators currently used commonly with strings (that I assume Path would inherit): + and %. Both actually make sense with paths too.

filename_template = '%(USER)s.conf'
p = Path('/conf') / filename_template % os.environ

which means:

p = (Path('/conf') / filename_template) % os.environ

But probably the opposite is intended. Still, it will usually be harmless. Which is sometimes worse than usually harmful. + seems completely innocuous, though:

ext = '.jpg'
name = fields['name']
image = Path('/images') / name + ext

It doesn't really matter what order it happens in there. Assuming concatenation results in a new Path object, not a str. -- Ian Bicking | [EMAIL PROTECTED] | http://blog.ianbicking.org
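Steven's precedence worry checks out: /, * and % sit on the same precedence level and group left to right. A check with a toy str-based Path (using __truediv__; the PEP era would have spelled it __div__):

```python
class P(str):
    """Toy path type: '/' joins with a separator."""
    def __truediv__(self, other):
        return P(self + "/" + other)

# '/' and '*' share a precedence level, so this parses as (P('home') / 'a') * 5
assert P("home") / "a" * 5 == "home/a" * 5       # 'home/ahome/a...'
# parenthesizing gives what the writer probably meant
assert P("home") / ("a" * 5) == "home/aaaaa"
```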
Re: [Python-Dev] The path module PEP
Barry Warsaw wrote: On Wed, 2006-01-25 at 18:10 -0600, Ian Bicking wrote: Paths are strings, that's in the PEP. As an aside, I think it should be specified what (if any) string methods won't be inherited by Path (or will be specifically disabled by making them throw some exception). I think .join() and __iter__ at least should be disabled. Whenever I see derived classes deliberately disabling base class methods, I see red flags that something in the design of the hierarchy isn't right. IMHO the hierarchy problem is a misdesign of strings; iterating over strings is usually a bug, not a deliberately used feature. And it's a particularly annoying bug, leading to weird results. In this case a Path is not a container for characters. Strings aren't containers for characters either -- apparently they are containers for smaller strings, which in turn contain themselves. Paths might be seen as a container for other subpaths, but I think everyone agrees this is too ambiguous and implicit. So there's nothing sensible that __iter__ can do, and having it do something not sensible (just to fill it in with something) does not seem very Pythonic. join is also a funny method that most people wouldn't expect on strings anyway. But putting that aside, the real issue I see is that it is a miscognate for os.path.join, to which it has no relation. And I can't possibly imagine what you'd use it for in the context of a path. -- Ian Bicking | [EMAIL PROTECTED] | http://blog.ianbicking.org
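The oddity pointed at here -- strings as containers of smaller strings that contain themselves -- in a few lines:

```python
s = "path"
assert list(s) == ["p", "a", "t", "h"]   # iteration yields length-1 strings
assert s[0][0][0] == s[0]                # ...each of which contains itself
```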
Re: [Python-Dev] The path module PEP
Gustavo J. A. M. Carneiro wrote: On a slightly different subject, regarding path / path, I think it feels much more natural path + path. Path.join is really just a string concatenation, except that it adds a path separator in the middle if necessary, if I'm not mistaken. No, it isn't, which maybe is why / is bad. os.path.join(a, b) basically returns the path as though b is interpreted to be relative to a. I.e., os.path.join('/foo', '/bar') == '/bar'. Not much like concatenation at all. Plus string concatenation is quite useful with paths, e.g., to add an extension. If a URI class implemented the same methods, it would be something of a question whether uri.joinpath('/foo/bar', 'baz') would return '/foo/baz' (and urlparse.urljoin would) or '/foo/bar/baz' (as os.path.join does). I assume it would be the latter, and urljoin would be a different method, maybe something novel like urljoin. -- Ian Bicking | [EMAIL PROTECTED] | http://blog.ianbicking.org
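The question at the end is easy to make concrete: the two join semantics really do diverge, shown here with posixpath and urllib.parse (the modern homes of these functions):

```python
import posixpath
from urllib.parse import urljoin

# os.path-style join: 'baz' is appended under the base path
assert posixpath.join("/foo/bar", "baz") == "/foo/bar/baz"
# URL-style resolution: 'baz' replaces the last segment of the base
assert urljoin("/foo/bar", "baz") == "/foo/baz"
```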