Re: [Python-Dev] Python-3.0, unicode, and os.environ
Martin v. Löwis wrote:

Please, if you have a *new* idea that doesn't have a failure mode, by all means post it. But don't resurrect a pointless bikeshed.

While I completely agree that it is pointless to reiterate the same arguments over and over, I disagree that the bikeshed metaphor applies. This metaphor (IIUC) describes a trivial design issue that is merely a matter of taste, rather than having deep technical implications. Using Unicode or bytes for strings is not of that kind.

+1 These issues are very important because they affect everyone. Even though very few people actually understand them. Including me, which is why I've been so quiet on this thread.

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/
___
Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] RELEASED Python 3.0 final
Barry Warsaw wrote:

On Dec 4, 2008, at 6:21 PM, Martin v. Löwis wrote:

I can't find any docs built for Python 3.0 (not 3.1a0). The Windows installation has new 3.0 doc dated Dec 3, so it was built, just not posted correctly.

That doesn't mean very much. I built it on my local machine. Anybody with subversion and python could do that; the documentation is in subversion. Whether or not it appears on the web site as part of the release process is an entirely different matter. It used to be that the doc maintainer (Fred Drake) was part of the release team and release process. I think Georg is complaining that he is release maintainer, but not part of the release process.

I've asked Georg to update PEP 101 to make his role as Documentation Expert explicit. Unfortunately we only debug major releases once (or twice) every 18 months. But next time, we'll get that part right for sure!

Done that now. Since release.py builds the docs all right, there's not much left for me to do except check that everything is ok.

In the meantime, I'll make sure Georg is involved in point releases moving forward.

That's good. Thanks!

Georg
--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
Re: [Python-Dev] Python-3.0, unicode, and os.environ
On Fri, Dec 5, 2008 at 12:00 AM, Martin v. Löwis [EMAIL PROTECTED] wrote:

Please, if you have a *new* idea that doesn't have a failure mode, by all means post it. But don't resurrect a pointless bikeshed.

While I completely agree that it is pointless to reiterate the same arguments over and over, I disagree that the bikeshed metaphor applies. This metaphor (IIUC) describes a trivial design issue that is merely a matter of taste, rather than having deep technical implications. Using Unicode or bytes for strings is not of that kind.

That we need to support both unicode and bytes is important, but already seems to have consensus. However, they present two distinct usage patterns:

* unicode text, presentable to the user, interacts with all manner of standardized APIs
* bytes, limited to local, internal use. Only approximated forms can be presented to the user, only custom formats can be saved externally

None of the proposals have turned these into a single use case. All they do is trade off various forms of subtly switching back and forth, which leads to failure. Debating which subtle failure is better is a bikeshed.

Not only that, but we already have a solution that makes the choice explicit, avoiding the subtle failure. This is the solution already in use for os file path functions. It's the solution Guido supports.

--
Adam Olsen, aka Rhamphoryncus
Re: [Python-Dev] Taint Mode in Python 3.0
Maciej Fijalkowski wrote:

Hello,

The thing is, pypy's taint code is broken. Basically you don't only need to patch all places that return a pyobject, but also all places that might modify anything (all side effects). For example, an innocent-looking call to addition might end up calling arbitrary python code (and have arbitrary side effects). The question is how you approach such things.

Taint isn't an easy problem, but PyPy is still a *much* better platform for that kind of experimentation than CPython. RPython, object spaces, the code generation, etc. all give you much more powerful tools to play with than the raw C code of the reference interpreter.

Cheers,
Nick.
--
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
Re: [Python-Dev] Python-3.0, unicode, and os.environ
[EMAIL PROTECTED] wrote:

At least this time I think I've encapsulated pretty much my entire argument here, so if you don't buy it, we can probably just agree to disagree :).

Glyph, the only point I would add to your message is this one:

Adding a blessed way to encode arbitrary binary data into a Python 3.0 str object strikes me as giving up on one of the key advances in the new version of the language.

8-bit strings were a problem in Python 2.x because they blurred the boundary between arbitrary binary data and ASCII or latin-1 character data. One of the most interesting aspects of Python 3.0 is its attempt to get developers to be explicit about this distinction (both in the code and in their own minds) by enforcing separation between arbitrary binary data (held in bytes and bytearray instances) and character data (held in str instances).

I don't understand how tunneling arbitrary binary data through str instances (*regardless* of encoding mechanism) can possibly fail to recreate exactly the same "is it text or binary data?" ambiguity problems that the str/bytes split is intended to eliminate. And if that happens, then what exactly was the point in moving to an all Unicode string model for Py3k?

Cheers,
Nick.
--
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
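The separation described in this message can be illustrated with a minimal sketch. The sample value and variable names are illustrative assumptions, not anything taken from the thread; the only behaviour relied on is that Python 3 requires an explicit decode and refuses to concatenate bytes and str.

```python
# Wire-level data: bytes, with no inherent character meaning.
raw = b"caf\xc3\xa9"

# Character data: produced only by an explicit decode with a named encoding.
text = raw.decode("utf-8")

# Python 3 refuses to combine the two implicitly.
try:
    raw + text
    mixing_rejected = False
except TypeError:
    mixing_rejected = True

print(repr(text), mixing_rejected)
```

Any scheme that smuggles undecodable bytes into a str would have to weaken one of these two steps, which seems to be the concern raised above.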
Re: [Python-Dev] Python-3.0, unicode, and os.environ
On Friday 05 December 2008 00:39:24, Martin v. Löwis wrote:

5) represent all environment variables in Unicode strings, including the ones that currently fail to decode. (then do the same to file names, then drop the byte-oriented file operations again)

Please, don't do that! Bytes are not characters!

--
Victor Stinner aka haypo http://www.haypocalc.com/blog/
Re: [Python-Dev] Python-3.0, unicode, and os.environ
On Friday 05 December 2008, Adam Olsen wrote:

Many of the windows APIs use UTF-16 without validating it. They'll pass through invalid strings until they hit something that does validate, at which point it'll blow up. I suspect that it doesn't happen very often in practice, as having only one encoding makes it quite clear that it's a broken file name, not a mixed encoding environment.

Actually, I wouldn't say that's a problem at all. The point is that stuff that is blissfully unaware of encodings typically uses some ASCII-de(p)rived text. Those char-strings are translated according to the current locale, which then does the filtering and validation. The result may be gibberish (GIGO principle) but at least it's UTF-16 gibberish. ;)

Uli
--
Sator Laser GmbH
Managing Director: Thorsten Föcking, Amtsgericht Hamburg HR B62 932
**
Visit our website at http://www.satorlaser.de/
**
This e-mail, including all attachments, is intended only for the addressee and may contain confidential information. Please notify the sender immediately if you are not the intended recipient. In that case the e-mail must be deleted and may not be read, forwarded, published, or otherwise used. E-mails can be read by third parties and may contain viruses and unauthorized changes. Sator Laser GmbH is not responsible for these consequences.
**
Re: [Python-Dev] Python-3.0, unicode, and os.environ
Hi,

On Thursday 04 December 2008 21:02:19, Toshio Kuratomi wrote:

I opened up bug http://bugs.python.org/issue4006 a while ago and it was suggested in the report that it's not a bug but a feature and so I should come here to see about getting the feature changed :-)

Yeah, I prefer to discuss such changes on the mailing list.

These mixed encodings can occur for a variety of reasons. Here's an example that isn't too contrived :-) (...) Furthermore, they don't want to suffer from the space loss of using utf-8 to encode Japanese so they use shift-jis everywhere.

space loss? Really? If you configure your server correctly, you should get UTF-8 even if the file system is Shift-JIS. But it would be much easier to use UTF-8 everywhere. Hum... I don't think that the discussion is about one specific server, but the lack of bytes environment variables in Python3 :-)

1) return mixed unicode and byte types in ...

NO!

2) return only byte types in os.environ

Hum... Most users have UTF-8 everywhere (eg. all Windows users ;-)), and Python3 already uses Unicode everywhere (input(), open(), filenames, ...).

3) silently ignore non-decodable value when accessing os.environ['PATH'] as we do now but allow access to the full information via os.environ[b'PATH'] and os.getenvb()

I don't like os.environ[b'PATH']. I prefer to always get the same result type... But os.listdir() doesn't respect that :-(

    os.listdir(str) -> list of str
    os.listdir(bytes) -> list of bytes

I would prefer a similar API for easier migration from Python2/Python3 (unicode). os.environb sounds like the best choice for me.

But there are open questions (already asked in the bug tracker):

(a) Should os.environ be updated if os.environb is changed? If yes, how?

    os.environb['PATH'] = '\xff' (or any invalid string in the system default encoding)
    => os.environ['PATH'] = ???

(b) Should os.environb be updated if os.environ is changed? If yes, how? The problem comes with non-Unicode locales (eg. latin-1 or ASCII): most charsets are unable to encode the whole Unicode charset (eg. codes >= 65535).

    os.environ['PATH'] = chr(0x1)
    => os.environb['PATH'] = ???

(c) Same question when a key is deleted (del os.environ['PATH']).

If Python 3.1 will have os.environ and os.environb, I'm quite sure that some modules will use os.environ and others will prefer os.environb. If both environments are different, the two sets of modules will work differently :-/

It would be maybe easier if os.environ supports bytes and unicode keys. But we have to keep these assertions:

    os.environ[bytes] -> bytes
    os.environ[str] -> str

4) raise an exception when non-decodable values are *accessed* and continue as in #3.

I like os.listdir() behaviour: just *ignore* non-decodable files. If you really want to access these files, use a bytes directory name ;-)

I think that the ease of debugging is lost when we silently ignore an error.

Guido gave a good example. If your directory contains a non-decodable filename (eg. ???.txt): glob('*.py') will fail because of the evil filename. With the current behaviour, you're unable to list all files but glob('*.py') will list all Python scripts! And Python3 is released, it's maybe a bad idea to change the behaviour (of os.environ) in Python 3.1 :-/

The bug report I opened suggests creating a PEP to address this issue.

Please, try to answer my questions about os.environ and os.environb consistency.

I also like bytes environment variables. I need them for my fuzzing program. The lack of bytes variables is a regression from Python2 (for my program). On UNIX, filenames are bytes and the environment variables are bytes. For the best interoperability, Python3 should support bytes. But the default choice should always be characters (unicode) and to never mix the bytes and str types ;-)

---

As usual, it goes faster if someone writes a patch :-) I could try to work on it.

--
Victor Stinner aka haypo http://www.haypocalc.com/blog/
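The "result type follows argument type" rule that this message cites for os.listdir() can be checked with a small sketch. The temporary directory and the file name are made up for illustration, and os.fsencode() is a helper from later Python versions than the 3.0 release discussed here; it is used only to obtain a bytes form of the path.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one ordinary file so there is something to list.
    open(os.path.join(tmp, "script.py"), "w").close()

    names_str = os.listdir(tmp)                 # str in, str out
    names_bytes = os.listdir(os.fsencode(tmp))  # bytes in, bytes out

print(names_str, names_bytes)
```

The sketch shows only the typing rule. It does not demonstrate how undecodable names are handled, which is the part under debate in the thread.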
Re: [Python-Dev] Python-3.0, unicode, and os.environ
On Friday 05 December 2008, Guido van Rossum wrote:

At the risk of bringing up something that was already rejected, let me propose something that follows the path taken in 3.0 for filenames, rather than doubling back:

For os.environ, os.getenv() and os.putenv(), I think a similar approach as used for os.listdir() and os.getcwd() makes sense: let os.environ skip variables whose name or value is undecodable, and have a separate os.environb() which contains bytes; let os.getenv() and os.putenv() do the right thing when the arguments passed in are bytes.

For sys.argv, because it's positional, you can't skip undecodable values, so I propose to use error=replace for the decoding; again, we can add sys.argvb that contains the raw bytes values. The various os.exec*() and os.spawn*() calls (as well as os.system(), os.popen() and the subprocess module) should all accept bytes as well as strings.

On Windows, the bytes APIs should probably not exist.

I predict that most developers can get away with not using the bytes APIs at all. The small minority that needs to be robust if not all filenames use the system encoding can use the bytes APIs.

I know some of those developers, you can contact them via [EMAIL PROTECTED]

Seriously, what would you suggest to someone that wants to handle paths in a portable way? Using the Unicode variants of functions is fubar, because encoding/decoding is not universally possible. Using the byte variant is equally fubar, because e.g. on MS Windows it is not supported, except through a very lossy roundtrip through the locale's codepage, limiting your functionality.

I actually think it is about time to give up on trying to think about a path as a string. Ditto for data received from os.environ or sys.argv. There are only very few things that are universal to them and a reliable encoding is none of them. Then, once you have let that idea go, meditate a bit over the Zen.

What I propose is that paths must be treated as OS-specific, with the only common reliable operations being joining them, concatenating them and splitting them into segments divided by the (again, OS-specific) separator. Other operations, like e.g. appending a string or converting it to a string in order to display it, can fail. And if they fail, they should fail noisily.

In 99% of all cases, using the default encoding will work and do what people expect, which is why I would make this conversion automatic. In all other cases, it will at least not fail silently (which would lead to garbage and data loss) and allow more sophisticated applications to handle it.

Uli
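The idea of a path as an opaque, OS-specific value could be sketched roughly as follows. Everything here is hypothetical: the class name OpaquePath, its methods, the "/" separator and the UTF-8 default are illustrative assumptions, not an API proposed in the thread or present in the standard library.

```python
class OpaquePath:
    """Hypothetical path type: a sequence of raw segments, not a string."""

    def __init__(self, *segments):
        self.segments = tuple(segments)  # each segment is bytes

    def join(self, other):
        """Joining never needs to know the encoding."""
        return OpaquePath(*self.segments, *other.segments)

    def split(self):
        """Return (parent, last segment) without decoding anything."""
        return OpaquePath(*self.segments[:-1]), self.segments[-1]

    def to_text(self, encoding="utf-8", sep="/"):
        """Convert for display; raises UnicodeDecodeError instead of guessing."""
        return sep.join(seg.decode(encoding) for seg in self.segments)


good = OpaquePath(b"home", b"uli").join(OpaquePath(b"caf\xc3\xa9.txt"))
bad = OpaquePath(b"home", b"\xff.txt")
```

Joining and splitting work on both values, while bad.to_text() raises UnicodeDecodeError, which is the noisy failure asked for above.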
[Python-Dev] Fix for frame_setlineno() in frameobject.c function
Hello,

This concerns a known bug in the frame_setlineno() function for Python 2.5.x and 2.6.x (maybe in earlier versions too). It is not possible to use this function when the address or line offset is greater than 127. The problem comes from the lnotab variable, which is typed char* and is therefore implicitly signed char*. Any value above 127 becomes a negative number.

The fix is very simple (applied on the Python 2.6.1 version of the source code):

--- frameobject.c Thu Oct 02 19:39:50 2008
+++ frameobject_fixed.c Fri Dec 05 11:27:42 2008
@@ -119,8 +119,8 @@
     line = f->f_code->co_firstlineno;
     new_lasti = -1;
     for (offset = 0; offset < lnotab_len; offset += 2) {
-        addr += lnotab[offset];
-        line += lnotab[offset+1];
+        addr += ((unsigned char*)lnotab)[offset];
+        line += ((unsigned char*)lnotab)[offset+1];
         if (line >= new_lineno) {
             new_lasti = addr;
             new_lineno = line;

It would be nice to fix it for Python 2.5 and above, in order to have a proper MSI installer for Windows.

Best regards,
Fabien Bouleau

DISCLAIMER: This e-mail contains proprietary information some or all of which may be legally privileged. It is for the intended recipient only. If an addressing or transmission error has misdirected this e-mail, please notify the author by replying to this e-mail. If you are not the intended recipient you must not use, disclose, distribute, copy, print, or rely on this e-mail.
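The effect of reading the table through a signed char can be reproduced with a short Python sketch. This is not the CPython code: the helper names and the sample increments are made up, and the sketch assumes a platform where plain char is signed, as the report does (in C the signedness of char is implementation-defined).

```python
def as_signed_char(byte):
    """Reinterpret an integer 0..255 the way a signed 8-bit char would read it."""
    return byte - 256 if byte > 127 else byte


def walk(lnotab, first_line, signed):
    """Sum the (addr, line) increment pairs stored in an lnotab-style byte string."""
    addr, line = 0, first_line
    for offset in range(0, len(lnotab), 2):
        addr_incr, line_incr = lnotab[offset], lnotab[offset + 1]
        if signed:
            addr_incr, line_incr = as_signed_char(addr_incr), as_signed_char(line_incr)
        addr += addr_incr
        line += line_incr
    return addr, line


sample = bytes([6, 1, 200, 2])  # made-up increments; 200 does not fit in a signed char
print(walk(sample, 10, signed=True))   # (-50, 13): the address goes negative
print(walk(sample, 10, signed=False))  # (206, 13): the intended reading
```

The cast to unsigned char* in the patch corresponds to the signed=False path.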
Re: [Python-Dev] RELEASED Python 3.0 final FFT
http://code.activestate.com/recipes/576550/ This recipe shows how to use gsl FFT with python 3. ctypes is really good!
Re: [Python-Dev] RELEASED Python 3.0 final
On Thu, 4 Dec 2008 22:05:05 -0800, Guido van Rossum [EMAIL PROTECTED] wrote:

On Thu, Dec 4, 2008 at 9:40 PM, [EMAIL PROTECTED] wrote:

The default case, the case of the user without the wherewithal to understand the nuances of the distinction between 2.x and 3.x, is a user who should use 2.x.

Not at all clear. If they're not sensitive to those nuances it's just as likely that they're a casual developer (e.g. a student just learning to program). Such users are unlikely to start using major 3rd party packages like Twisted or Django, which would be completely overwhelming to someone just learning.

That seems like it would be right to me, but two or three times a month someone shows up in the Twisted IRC channel who is learning both Python and Twisted at the same time. So apparently there are a lot of people for whom this isn't overwhelming.

Jean-Paul
Re: [Python-Dev] RELEASED Python 3.0 final
On Fri, Dec 5, 2008 at 12:35 AM, A.M. Kuchling [EMAIL PROTECTED] wrote:

On Thu, Dec 04, 2008 at 05:29:31PM -0800, Raymond Hettinger wrote:

Here's a bright idea. On the 3.0 release page, include a box listing which major third-party apps have been converted. Update it once every couple of weeks. That way, we're not explicitly

That's an excellent idea. We could have a webpage, or start a topic-specific weblog for posting announcements. I've started a draft of a 3.0 FAQ in the wiki at http://wiki.python.org/moin/Python3000/FAQ. Once it's finished we can move it into the 3.0 release pages. Everyone please edit and improve it!

Some time ago I started a page on the wiki to collect reports of early migrations by the community: http://wiki.python.org/moin/Early2to3Migrations Maybe this would be relevant to point to from the FAQ.

--amk

--
Eduardo de Oliveira Padoan http://djangopeople.net/edcrypt/
Distrust those in whom the desire to punish is strong. -- Goethe, Nietzsche, Dostoevsky
Re: [Python-Dev] Fix for frame_setlineno() in frameobject.c function
Hi,

Please post this on the issue tracker. http://bugs.python.org

On Fri, Dec 5, 2008 at 4:42 AM, [EMAIL PROTECTED] wrote:

Hello,

This concerns a known bug in the frame_setlineno() function for Python 2.5.x and 2.6.x (maybe in earlier versions too). It is not possible to use this function when the address or line offset is greater than 127. The problem comes from the lnotab variable, which is typed char* and is therefore implicitly signed char*. Any value above 127 becomes a negative number.

The fix is very simple (applied on the Python 2.6.1 version of the source code):

--- frameobject.c Thu Oct 02 19:39:50 2008
+++ frameobject_fixed.c Fri Dec 05 11:27:42 2008
@@ -119,8 +119,8 @@
     line = f->f_code->co_firstlineno;
     new_lasti = -1;
     for (offset = 0; offset < lnotab_len; offset += 2) {
-        addr += lnotab[offset];
-        line += lnotab[offset+1];
+        addr += ((unsigned char*)lnotab)[offset];
+        line += ((unsigned char*)lnotab)[offset+1];
         if (line >= new_lineno) {
             new_lasti = addr;
             new_lineno = line;

--
Cheers,
Benjamin Peterson
There's nothing quite as beautiful as an oboe... except a chicken stuck in a vacuum cleaner.
Re: [Python-Dev] Python-3.0, unicode, and os.environ
On Dec 5, 2008, at 5:27 AM, Ulrich Eckhardt wrote:

Using the byte variant is equally fubar, because e.g. on MS Windows it is not supported, except through a very lossy roundtrip through the locale's codepage, limiting your functionality.

Yeah, IMO the whole mess could have been avoided by keeping the filename/args/environ simply *bytes*, like it really is, on unix. Then, make the Windows version of python use (always! not dependent upon locale!) utf-8 to decode the utf-8 bytestring to the UTF-16 that the Windows platform APIs expect (and vice versa). And never use the ASCII variant of the windows APIs.

This would mean that all *inputs* would succeed, but some *outputs* would not, on Windows. But that's not a new kind of failure: NUL has never been allowed in argv/environ, and filenames have all sorts of platform-dependent restrictions.

But unfortunately, it's too late for that solution...

James
Re: [Python-Dev] Python-3.0, unicode, and os.environ
Terry Reedy wrote:

Toshio Kuratomi wrote:

I would think life would be ultimately easier if either the file server or the shell server automatically translated file names from jis and utf8 and back, so that the PATH on the *nix shell server is entirely utf8.

This is not possible because no part of the computer knows what the encoding is. To the computer, it's just a sequence of bytes. Unlike xml or the windows filesystem (winfs? ntfs?) where the encoding is specified as part of the document/filesystem there's nothing to tell what encoding the filenames are in.

I thought you said that the file server keeps all filenames in shift-jis, and the shell server all in utf-8.

Yes. But this is part of the setup of the example to keep things simple. The fileserver or shell server could themselves be of mixed encodings (for instance, if it was serving home directories to users all over the world each user might be using a different encoding.)

If so, then the shell server could know if it were told so.

Where are you going to store that information? In order for python to run without errors, will it have to be configured on each system it's installed on to know the encoding of each filename? Or are we going to try to talk each *NIX vendor into creating new filesystems that record that information and after a five year span of time declare that python will not run on other filesystems in corner cases? I think that this way does not hold a reasonable expectation of keeping python a portable language.

-Toshio
[Python-Dev] Python security: draft article on the wiki
Hi, I started to write a short article about Python security on the wiki: http://wiki.python.org/moin/Security Nothing useful yet.

--
Victor Stinner aka haypo http://www.haypocalc.com/blog/
Re: [Python-Dev] RELEASED Python 3.0 final
    Martin> There is. There have been the following trove classifiers
    Martin> defined for a few weeks now:

    Martin> Programming Language :: Python :: 2
    Martin> Programming Language :: Python :: 2.3
    Martin> Programming Language :: Python :: 2.4
    Martin> Programming Language :: Python :: 2.5
    Martin> Programming Language :: Python :: 2.6
    Martin> Programming Language :: Python :: 2.7
    Martin> Programming Language :: Python :: 3
    Martin> Programming Language :: Python :: 3.0
    Martin> Programming Language :: Python :: 3.1

Good. Now we just need to populate them. I take it the classifiers without minor numbers imply any known minor version (e.g., 2 == 2.3 and greater)?

Skip
Re: [Python-Dev] RELEASED Python 3.0 final
On Fri, Dec 05, 2008 at 05:40:46AM -, [EMAIL PROTECTED] wrote:

For most users, especially new users who have yet to be impressed with Python's power, 2.x is much better. It's not like library support is one small check-box on the language's feature sheet: most of the attractive things about Python are libraries. Of course I am not free

Here I agree, sort of. Newbies may not understand what they're giving up in terms of libraries. (The 'sort of' is because, having learned 3.0, learning the changes for 2.6 is certainly much easier than learning a first programming language is.)

The third (albeit much less likely) option is that you're learning Python to learn to interact with a system that's scriptable in embedded Python, like Blender or Gimp. I don't think there's a single system of that variety which uses 3.0 yet, and these will likely be even slower to move than libraries.

Let me note that if some application embeds Python for a specialized purpose, where the only modules imported are either user-written or part of the application, it seems much *easier* to move to Python 3 because the scripts don't use arbitrary third-party libraries. Python embedded in an e-mail MTA might use libraries for DNS or file I/O or databases and has to be cautious about versions; Python in Gimp probably doesn't, in practice.

--amk
Re: [Python-Dev] Python + Java Integration
One thing that would help Python in this debate (or, perhaps simply put it in the running, at least as a next Java candidate) would be if Python had an easier migration path for Java developers that currently rely upon various third-party libraries. The wealth of third-party libraries available for Java has always been one of its great strengths. Ergo, if Python had an easy-to-use, recommended way to use those libraries within the Python environment, that would be a significant advantage to present to Java developers and those who would choose Ruby over Java. Platform compatibility is always a huge motivator for those looking to migrate or upgrade.

Personally, I'm using Andi Vajda's JCC for this purpose. Recommended. The nice thing about it is that it turns jar files into Python modules; you don't need the source.

http://pypi.python.org/pypi/JCC

Bill
[Python-Dev] Summary of Python tracker Issues
ACTIVITY SUMMARY (11/28/08 - 12/05/08)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue number. Do NOT respond to this message.

 2233 open (+55) / 14139 closed (+41) / 16372 total (+96)

Open issues with patches: 753

Average duration of open issues: 705 days.
Median duration of open issues: 2193 days.

Open Issues Breakdown
   open    2214 (+54)
pending      19 ( +1)

Issues Created Or Reopened (96)
_______________________________

Coding cookie crashes IDLE                                       11/28/08
CLOSED http://bugs.python.org/issue4454    created  tjreedy

No Windows List in IDLE if several windows have the same title   11/28/08
CLOSED http://bugs.python.org/issue4455    created  amaury.forgeotdarc
       patch

xmlrpc is broken                                                 11/28/08
CLOSED http://bugs.python.org/issue4456    created  benjamin.peterson

__import__ documentation obsolete                                11/29/08
       http://bugs.python.org/issue4457    created  stevenjd

getopt.gnu_getopt() loses dash argument                          11/29/08
CLOSED http://bugs.python.org/issue4458    created  muntyan

bdist_rpm assumes python                                         11/29/08
       http://bugs.python.org/issue4459    created  John5342

The parameter of PyInt_AsSsize_t() is not checked to see if it i 11/29/08
CLOSED http://bugs.python.org/issue4460    created  CWRU_Researcher1

parameters of PyLong_FromString() are not checked for NULL       11/29/08
       http://bugs.python.org/issue4461    created  CWRU_Researcher1
       patch

result of PyList_GetItem() not validated                         11/29/08
CLOSED http://bugs.python.org/issue4462    created  CWRU_Researcher1

Parameters and result of PyList_GetItem() are not validated      11/29/08
CLOSED http://bugs.python.org/issue4463    created  CWRU_Researcher1

PyList_GetItem() result and parameters not fully validated       11/29/08
CLOSED http://bugs.python.org/issue4464    created  CWRU_Researcher1

The result of set_copy() is not checked for NULL                 11/29/08
CLOSED http://bugs.python.org/issue4465    created  CWRU_Researcher1

The return value of PyFile_FromFile is not checked for NULL      11/29/08
CLOSED http://bugs.python.org/issue4466    created  CWRU_Researcher1

return value of PyUnicode_AsEncodedString() is not checked for N 11/29/08
CLOSED http://bugs.python.org/issue4467    created  CWRU_Researcher1

Restore chapter enumeration in Python docs                       11/30/08
CLOSED http://bugs.python.org/issue4468    created  schluehk

CVE-2008-5031 multiple integer overflows                         11/30/08
       http://bugs.python.org/issue4469    created  doko

smtplib SMTP_SSL not working.                                    11/30/08
       http://bugs.python.org/issue4470    created  lcatucci
       patch

IMAP4 missing support for starttls                               11/30/08
       http://bugs.python.org/issue4471    created  lcatucci
       patch

Is shared lib building broken on trunk?                          11/30/08
       http://bugs.python.org/issue4472    created  skip.montanaro

POP3 missing support for starttls
Re: [Python-Dev] Python-3.0, unicode, and os.environ
Victor Stinner wrote:
> Hi,
>
> On Thursday 04 December 2008 21:02:19, Toshio Kuratomi wrote:
>> These mixed encodings can occur for a variety of reasons. Here's an example that isn't too contrived :-)
>> (...)
>> Furthermore, they don't want to suffer from the space loss of using utf-8 to encode Japanese so they use shift-jis everywhere.
>
> space loss? Really? If you configure your server correctly, you should get UTF-8 even if the file system is Shift-JIS. But it would be much easier to use UTF-8 everywhere.
>
> Hum... I don't think that the discussion is about one specific server, but the lack of bytes environment variables in Python3 :-)

Yep. I can't change the logicalness of the policies of a different organization, only code my application to deal with it :-)

>> 1) return mixed unicode and byte types in ...
>
> NO!

It's nice that we agree... but I would prefer if you leave enough context so that others can see that we agree as well :-)

>> 2) return only byte types in os.environ
>
> Hum... Most users have UTF-8 everywhere (eg. all Windows users ;-)), and Python3 already uses Unicode everywhere (input(), open(), filenames, ...).

We're also in agreement here.

>> 3) silently ignore non-decodable value when accessing os.environ['PATH'] as we do now but allow access to the full information via os.environ[b'PATH'] and os.getenvb()
>
> I don't like os.environ[b'PATH']. I prefer to always get the same result type... But os.listdir() doesn't respect that :-(
>
>    os.listdir(str) -> list of str
>    os.listdir(bytes) -> list of bytes
>
> I would prefer a similar API for easier migration from Python2/Python3 (unicode). os.environb sounds like the best choice for me.

nod. After thinking about how it would be used in subprocess calls I agree. os.environb would allow us to retrieve the full dict as bytes. os.environ[b''] only works on individual keys. Also os.getenv serves the same purpose as os.environ[b''] would whereas os.environb would have its own uses.

> But they are open questions (already asked in the bug tracker):

I answered these in the bug tracker. Here are the answers for the mailing list:

> (a) Should os.environ be updated if os.environb is changed? If yes, how?
>
>    os.environb['PATH'] = '\xff' (or any invalid string in the system default encoding)
>    => os.environ['PATH'] = ???

The underlying environment that both variables reflect should be updated but what is displayed by os.environ should continue to follow the same rules. So if we follow option #3::

    os.environb['PATH'] = b'\xff'
    os.environ['PATH'] => raises KeyError because PATH is not a key in the unicode decoded environment.

(option #4 would issue a UnicodeDecodeError instead of a KeyError)

Similarly, if you start with a variable in os.environb that can only be represented as bytes and your program transforms it into something that is decodable it should then show up in os.environ.

> (b) Should os.environb be updated if os.environ is changed? If yes, how?
>
> The problem comes with non-Unicode locales (eg. latin-1 or ASCII): most charsets are unable to encode the whole Unicode charset (eg. codes >= 65535).
>
>    os.environ['PATH'] = chr(0x1)
>    => os.environb['PATH'] = ???

Ah, this is a good question. I misunderstood what you were getting at when you posted this to the bug report. I see several options but the one that seems the most sane is to raise UnicodeEncodeError when setting the value. With that, proper code to set an environment variable might look like this::

    LANG=C python3.0

    variable = chr(0x1)
    try:
        # Unicode aware locales
        os.environ['MYVAR'] = variable
    except UnicodeEncodeError:
        # Non-Unicode locales
        os.environb['MYVAR'] = bytes(variable, encoding='utf8')

> (c) Same question when a key is deleted (del os.environ['PATH']).

Update the underlying env so both os.environ and os.environb reflect the change. Deleting should not hold the problems that updating does.

> If Python 3.1 will have os.environ and os.environb, I'm quite sure that some modules will use os.environ and others will prefer os.environb. If both environments are different, the two sets of modules will work differently :-/

Exactly. So making sure they hold the same information is a priority.

> It would be maybe easier if os.environ supports bytes and unicode keys. But we have to keep these assertions:
>
>    os.environ[bytes] -> bytes
>    os.environ[str] -> str

I think the same choices have to be made here. If LANG=C, we still have to decide what to do when os.environ[str] is set to a non-ASCII string. Additionally, the subprocess question makes using the key value undesirable compared with having a separate os.environb that accesses the same underlying data.

>> 4) raise an exception when non-decodable values are *accessed* and continue as in #3.
>
> I like os.listdir() behaviour: just *ignore* non-decodable files. If you really want to access these
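The answers to (a) and (b) above can be sketched in a few lines. This is only an illustration, not the proposed implementation: SharedEnv, get_str and set_str are hypothetical names, a plain dict stands in for the real process environment, and the encoding is passed in explicitly where the real thing would use the locale's encoding.

```python
class SharedEnv:
    """Hypothetical model: one bytes store, with a str view layered on top."""

    def __init__(self, encoding):
        self.encoding = encoding
        self.raw = {}  # plays the role of os.environb (bytes -> bytes)

    def get_str(self, key):
        """Option #3: an undecodable value looks like a missing key."""
        raw_value = self.raw[key.encode(self.encoding)]
        try:
            return raw_value.decode(self.encoding)
        except UnicodeDecodeError:
            raise KeyError(key) from None

    def set_str(self, key, value):
        """Answer (b): refuse values the encoding cannot represent."""
        raw_key = key.encode(self.encoding)
        raw_value = value.encode(self.encoding)  # may raise UnicodeEncodeError
        self.raw[raw_key] = raw_value


env = SharedEnv("ascii")  # roughly what LANG=C would give

# (a) a value set through the bytes side is hidden from the str side
env.raw[b"PATH"] = b"\xff"
try:
    env.get_str("PATH")
    path_visible = True
except KeyError:
    path_visible = False

# (b) setting through the str side fails loudly, so fall back to bytes
try:
    env.set_str("MYVAR", "\xe9")
except UnicodeEncodeError:
    env.raw[b"MYVAR"] = "\xe9".encode("utf8")
```

Because both views read and write the same store, replacing the bytes value with something decodable makes it show up on the str side again, which is the "should then show up in os.environ" behaviour described above.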
Re: [Python-Dev] Python-3.0, unicode, and os.environ
On Fri, Dec 5, 2008 at 2:27 AM, Ulrich Eckhardt [EMAIL PROTECTED] wrote:
> Seriously, what would you suggest to someone that wants to handle paths in a portable way? Using the Unicode variants of functions is fubar, because encoding/decoding is not universally possible. Using the byte variant is equally fubar, because e.g. on MS Windows it is not supported, except through a very lossy roundtrip through the locale's codepage, limiting your functionality.

Write a lightweight abstraction layer that uses Unicode when possible and bytes otherwise. You'd need to write a few functions for the path handling code you need, with a platform check or two sprinkled in. Writing such an abstraction for the purpose of one specific application is usually simple enough. However, writing a similar abstraction that serves all apps and all use cases is hard. I hope that eventually someone will come up with one though -- the failure of earlier path object proposals notwithstanding.

> I actually think it is about time to give up on trying to think about a path as a string. Ditto for data received from os.environ or sys.argv. There are only very few things that are universal to them and a reliable encoding is none of them. Then, once you have let that idea go, meditate a bit over the Zen.

This sounds too pessimistic to me. I expect that in five years it will be universally accepted that these variables must be encoded in a standard encoding. People are never going to give up thinking about filenames etc. as strings, because that's what they are conceptually. The problem is purely one of encoding, and that's where Unix/Linux are behind the curve, since (so far) they haven't taken the plunge and picked a universal standard encoding, the way Windows and Mac OS X have done.

> What I propose is that paths must be treated as OS-specific, with the only common reliable operations being joining them, concatenating them and splitting them into segments divided by the (again, OS-specific) separator. Other operations, like e.g. appending a string or converting it to a string in order to display it can fail. And if they fail, they should fail noisily.

That's bad though, since filenames are being displayed all the time (e.g. in error messages).

> In 99% of all cases, using the default encoding will work and do what people expect, which is why I would make this conversion automatic. In all other cases, it will at least not fail silently (which would lead to garbage and data loss) and allow more sophisticated applications to handle it.

I think the "always fail noisily" approach isn't the best approach. E.g. if I am globbing for *.py, and there's an undecodable .txt file in a directory, its presence shouldn't cause the glob to fail.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
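As an illustration of the "skip what cannot be decoded" behaviour argued for above, here is a minimal sketch. decodable_names is a hypothetical helper, the directory listing is a hard-coded list of bytes rather than a real os.listdir() call, and UTF-8 is assumed where real code would use the filesystem encoding.

```python
def decodable_names(raw_names, encoding="utf-8"):
    """Return the names that decode cleanly, silently skipping the rest."""
    names = []
    for raw in raw_names:
        try:
            names.append(raw.decode(encoding))
        except UnicodeDecodeError:
            # Undecodable entry: not visible through the str API.
            continue
    return names


# Stand-in for a bytes directory listing containing one undecodable name.
listing = [b"setup.py", b"notes-\xff.txt", b"main.py"]
python_files = [n for n in decodable_names(listing) if n.endswith(".py")]
```

The undecodable .txt entry never reaches the *.py filter, so it cannot make the search fail. The cost is that an undecodable .py file would be skipped just as quietly; a program that cares has to work on the bytes listing instead.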
Re: [Python-Dev] RELEASED Python 3.0 final
On Dec 4, 2008, at 7:59 PM, [EMAIL PROTECTED] wrote:
> On 02:35 am, [EMAIL PROTECTED] wrote:
>> On Thu, Dec 04, 2008 at 05:29:31PM -0800, Raymond Hettinger wrote:
>>> Here's a bright idea. On the 3.0 release page, include a box listing which major third-party apps have been converted. Update it once every couple of weeks. That way, we're not explicitly
>>
>> That's an excellent idea. We could have a webpage, or start a topic-specific weblog for posting announcements. I've started a draft of a 3.0 FAQ in the wiki at http://wiki.python.org/moin/Python3000/FAQ. Once it's finished we can move it into the 3.0 release pages. Everyone please edit and improve it!
>
> It occurs to me that this specific idea (the box with the list of supported applications / libraries) should be implementable as a simple query against PyPI. I don't know if it actually is :), but it should be. In general it would be nice to know whether one's favorite tools were available for *any* new Python version.

I agree with this. Plus it might act as an incentive for people to port libraries faster...

Ted
Re: [Python-Dev] RELEASED Python 3.0 final
On Thu, Dec 4, 2008 at 11:27 PM, [EMAIL PROTECTED] wrote:
> With all due respect, for me, library support and serious use are synonymous.

Glyph, I cannot have a discussion with you if every single post of yours is longer than my combined daily output. Please spend some time writing shorter posts. I'm sure I'm not the only one here with a short attention span. :-)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] RELEASED Python 3.0 final
On Dec 5, 2008, at 10:25 AM, [EMAIL PROTECTED] wrote:
> Good. Now we just need to populate them. I take it the classifiers without minor numbers imply any known minor version (e.g., 2 == 2.3 and greater)?

This is an excellent question, Skip. There was already "Programming Language :: Python", provided by many packages.

I think version compatibility relationships meant by each of these classifiers should be made explicit, wherever it is that documentation for classifiers is provided. I don't recall having seen any such documentation; hopefully I just need to be hit by another clue.

-Fred

--
Fred Drake fdrake at acm.org
[Python-Dev] __import__ docs follow-up
Hi,

as a follow-up to the thread a few days ago, and the bug report, I've rewritten most of the __import__ docs. I've attached the suggested patch to the issue http://bugs.python.org/issue4457. I'd be glad for reviews.

Also, I'd like to ask for opinions on whether this "winning idiom" (as a bug comment states) should be in it, instead of the getattr() helper function:

    import sys
    __import__('x.y.z')
    mod = sys.modules['x.y.z']

Georg
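To show why the idiom is needed at all, here is a small sketch using a standard library package in place of the placeholder 'x.y.z'. import_dotted is a hypothetical helper name chosen for this example, not something from the patch.

```python
import sys


def import_dotted(name):
    """Import a dotted module name and return the innermost module."""
    __import__(name)          # imports x, x.y and x.y.z as a side effect
    return sys.modules[name]  # look the leaf module up by its full name


# A bare __import__ call returns the top-level package...
top = __import__("xml.dom.minidom")

# ...while the idiom hands back the submodule that was asked for.
leaf = import_dotted("xml.dom.minidom")
```

Here top is the xml package, so reaching minidom from it would take the getattr() walk the idiom is meant to replace, while leaf is already xml.dom.minidom.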
[Python-Dev] ANN: new python-porting mailing list
Hi all,

to facilitate discussion about porting Python code between different versions (mainly of course from 2.x to 3.x), we've created a new mailing list [EMAIL PROTECTED]

It is a public mailing list open to everyone. We expect active participation of many people porting their libraries/programs, and hope that the list can be a help to all wanting to go this (not always smooth :-) way.

@python-dev: it would of course be nice to have more than a few developers on that list ;-)

regards,
Georg
Re: [Python-Dev] Merging flow
On Thu, Dec 4, 2008 at 3:12 PM, Christian Heimes [EMAIL PROTECTED] wrote:
> Flow diagram
>
> trunk ---> release26-maint
>    \-> py3k ---> release30-maint

I'm running into problems making this work, with a trivial change: I committed r67590 (which adds a single assert to ast.c) to the trunk, then merged to 2.6 and py3k in r67592 and r67595 respectively. Then I tried:

    ../svnmerge.py merge -r67595

from the root directory of a clean copy of the release30-maint branch (svn status gives no output), and got conflicts on '.':

    property 'svnmerge-integrated' set on '.'

    property 'svnmerge-blocked' set on '.'

    --- Merging r67595 into '.':
    U    Python/ast.c
     C   .

    property 'svnmerge-integrated' set on '.'

    property 'svnmerge-blocked' deleted from '.'.

I now have a new file dir_conflicts.prej that looks something like:

    Trying to change property 'svnmerge-integrated' from '/python/trunk:1-61437,...,67528,67590', but property has been locally changed from '/python/branches/py3k:1-67498,67522-67524,67539,67541,67559,67588' to '/python/trunk:1-61437,...,67467,67484,67528'.

(where the ... abbreviates a big long list of revision numbers).

Did I mess up somewhere, or does svnmerge not work on a revision that was itself the result of an svnmerge?

Mark
Re: [Python-Dev] ANN: new python-porting mailing list
On Fri, Dec 5, 2008 at 10:36, Georg Brandl [EMAIL PROTECTED] wrote:
> Hi all,
>
> to facilitate discussion about porting Python code between different versions (mainly of course from 2.x to 3.x), we've created a new mailing list [EMAIL PROTECTED]
>
> It is a public mailing list open to everyone. We expect active participation of many people porting their libraries/programs, and hope that the list can be a help to all wanting to go this (not always smooth :-) way.

The mailing list URL is http://mail.python.org/mailman/listinfo/python-porting for those who don't want to search on the mail.python.org home page (which looks really dated at this point).

-Brett
Re: [Python-Dev] Merging flow
On Fri, Dec 5, 2008 at 11:20, Mark Dickinson [EMAIL PROTECTED] wrote:
> On Thu, Dec 4, 2008 at 3:12 PM, Christian Heimes [EMAIL PROTECTED] wrote:
>> Flow diagram
>>
>> trunk ---> release26-maint
>>    \-> py3k ---> release30-maint
>
> I'm running into problems making this work, with a trivial change: I committed r67590 (which adds a single assert to ast.c) to the trunk, then merged to 2.6 and py3k in r67592 and r67595 respectively. Then I tried:
>
>    ../svnmerge.py merge -r67595
>
> from the root directory of a clean copy of the release30-maint branch (svn status gives no output), and got conflicts on '.':
>
>    property 'svnmerge-integrated' set on '.'
>
>    property 'svnmerge-blocked' set on '.'
>
>    --- Merging r67595 into '.':
>    U    Python/ast.c
>     C   .
>
>    property 'svnmerge-integrated' set on '.'
>
>    property 'svnmerge-blocked' deleted from '.'.
>
> I now have a new file dir_conflicts.prej that looks something like:
>
>    Trying to change property 'svnmerge-integrated' from '/python/trunk:1-61437,...,67528,67590', but property has been locally changed from '/python/branches/py3k:1-67498,67522-67524,67539,67541,67559,67588' to '/python/trunk:1-61437,...,67467,67484,67528'.
>
> (where the ... abbreviates a big long list of revision numbers).
>
> Did I mess up somewhere, or does svnmerge not work on a revision that was itself the result of an svnmerge?

Someone might know better than me, but I am willing to bet you can't svnmerge a svnmerge revision, since the svnmerge revision contains changes to the metadata on '.' that will conflict with the new svnmerge values produced by the svnmerge you are trying to do.

But if I am right about this, then won't that require blocking, on release30-maint, the svnmerge revision made on py3k? Ugh. Is this getting to the point that we can only svnmerge between trunk and py3k, and the maintenance branches just have to be managed the old-fashioned way?

And I have pinged the people helping me with the DVCS PEP in hopes of getting us moved off of svn sooner rather than later.

-Brett
Re: [Python-Dev] Merging flow
On Dec 5, 2008, at 2:20 PM, Mark Dickinson wrote:
> Did I mess up somewhere, or does svnmerge not work on a revision that was itself the result of an svnmerge?

I ran into this yesterday as well with my patch to the cgi module. The work-around was to revert the change to that property and edit it manually.

I think this is a significant issue, since editing that property is about as error-prone as it can be. I've not really looked at the code in svnmerge.py, so I'm not sure how hard it would be to fix.

-Fred

--
Fred Drake fdrake at acm.org
Re: [Python-Dev] RELEASED Python 3.0 final
[EMAIL PROTECTED] wrote:
> To be fair, if someone asked me specifically about educating non-programmer adults about programming, I would probably at least *mention* py3, if not recommend it outright. The improved consistency is worth a lot in an educational setting. (But, if one is educating children and interested in soliciting their genuine enthusiasm, whiz-bang graphics are really a must-have, not a negotiable extra.)

As a non-native English speaker I'm not sure I understand correctly what you mean by whiz-bang graphics. Nevertheless I'd like to point you to the new turtle graphics module (which has been part of the standard library since 2.6). At least it was designed especially for use in the educational domain. Moreover, the source distribution also contains about ten example scripts.

Regards,
Gregor
Re: [Python-Dev] ANN: new python-porting mailing list
    Georg> [EMAIL PROTECTED]

    Georg> It is a public mailing list open to everyone. We expect active
    Georg> participation of many people porting their libraries/programs,
    Georg> and hope that the list can be a help to all wanting to go this
    Georg> (not always smooth :-) way.

I trust you will announce this in python-list and python-announce-list if you haven't already?

Skip
Re: [Python-Dev] ANN: new python-porting mailing list
[EMAIL PROTECTED] wrote:
>     Georg> [EMAIL PROTECTED]
>
>     Georg> It is a public mailing list open to everyone. We expect active
>     Georg> participation of many people porting their libraries/programs,
>     Georg> and hope that the list can be a help to all wanting to go this
>     Georg> (not always smooth :-) way.
>
> I trust you will announce this in python-list and python-announce-list if you haven't already?

I've sent it to python-announce, it's in the moderator queue. I'm not on python-list so I can't answer followups. If you'd like to do an announcement there, I'd be happy :)

Georg
Re: [Python-Dev] RELEASED Python 3.0 final
On 5-Dec-08, at 8:40 AM, A.M. Kuchling wrote:
> On Fri, Dec 05, 2008 at 05:40:46AM -, [EMAIL PROTECTED] wrote:
>> For most users, especially new users who have yet to be impressed with Python's power, 2.x is much better. It's not like library support is one small check-box on the language's feature sheet: most of the attractive things about Python are libraries. Of course I am not free
>
> Here I agree, sort of. Newbies may not understand what they're giving up in terms of libraries. (The 'sort of' is because, having learned 3.0, learning the changes for 2.6 is certainly much easier than learning a first programming language is.)

For possible insight, here is a current discussion on the topic:

http://www.reddit.com/r/programming/comments/7hlra/ask_progit_ive_got_the_itch_to_learn_python_since/

(note that these would be programmers interested in learning python, not people trying to learn programming)

-Mike
Re: [Python-Dev] Python-3.0, unicode, and os.environ
Guido van Rossum wrote:
> On Fri, Dec 5, 2008 at 2:27 AM, Ulrich Eckhardt [EMAIL PROTECTED] wrote:
>> In 99% of all cases, using the default encoding will work and do what people expect, which is why I would make this conversion automatic. In all other cases, it will at least not fail silently (which would lead to garbage and data loss) and allow more sophisticated applications to handle it.
>
> I think the "always fail noisily" approach isn't the best approach. E.g. if I am globbing for *.py, and there's an undecodable .txt file in a directory, its presence shouldn't cause the glob to fail.

But why should it make glob() fail? This sounds like an implementation detail of glob. Here's some pseudo-code::

    def glob(pattern):
        string = False
        if isinstance(pattern, str):
            string = True
            if platform == 'POSIX':
                pattern = bytes(pattern, encoding=defaultencoding)
        rawfiles = os.listdir(os.path.dirname(pattern) or pattern)
        if string and platform == 'POSIX':
            return [str(f) for f in rawfiles if match(f, pattern)]
        else:
            return rawfiles

This way the traceback occurs if anything in the result set is undecodable. What am I missing?

-Toshio
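The pseudo-code above leaves match, platform and defaultencoding undefined, so here is one way the same idea might be made runnable. It is a sketch under assumptions: match_then_decode is a hypothetical name, fnmatch.fnmatchcase (which accepts bytes when both arguments are bytes) stands in for match, the listing is a hard-coded list of bytes, and UTF-8 is assumed for the encoding.

```python
import fnmatch


def match_then_decode(raw_names, pattern, encoding="utf-8"):
    """Filter a bytes listing against a str pattern, then decode the hits.

    Matching happens on bytes, so undecodable names that do not match are
    never decoded. A matching name that cannot be decoded raises
    UnicodeDecodeError, i.e. the failure is noisy.
    """
    raw_pattern = pattern.encode(encoding)
    hits = [n for n in raw_names if fnmatch.fnmatchcase(n, raw_pattern)]
    return [n.decode(encoding) for n in hits]


# An undecodable .txt file does not disturb a search for *.py ...
quiet = match_then_decode([b"a.py", b"\xff.txt", b"b.py"], "*.py")

# ... but an undecodable .py file is part of the result set and fails.
try:
    match_then_decode([b"a.py", b"\xff.py"], "*.py")
    failed_noisily = False
except UnicodeDecodeError:
    failed_noisily = True
```

The design point is the order of operations: filtering before decoding means the traceback is limited to names the caller actually asked for.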
Re: [Python-Dev] Python-3.0, unicode, and os.environ
On Fri, Dec 5, 2008 at 12:05 PM, Toshio Kuratomi [EMAIL PROTECTED] wrote:
> Guido van Rossum wrote:
>> On Fri, Dec 5, 2008 at 2:27 AM, Ulrich Eckhardt [EMAIL PROTECTED] wrote:
>>> In 99% of all cases, using the default encoding will work and do what people expect, which is why I would make this conversion automatic. In all other cases, it will at least not fail silently (which would lead to garbage and data loss) and allow more sophisticated applications to handle it.
>>
>> I think the "always fail noisily" approach isn't the best approach. E.g. if I am globbing for *.py, and there's an undecodable .txt file in a directory, its presence shouldn't cause the glob to fail.
>
> But why should it make glob() fail? This sounds like an implementation detail of glob.

Glob was just an example. Many use cases for directory traversal couldn't care less if they see *all* files.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] Python-3.0, unicode, and os.environ
Guido van Rossum wrote:
> Glob was just an example. Many use cases for directory traversal couldn't care less if they see *all* files.

Okay. Makes it harder to prove correct or not if I don't know what the use case is :-) I can't think of a single use case off-hand. Even your example of a ??.txt file making retrieval of *.py files fail is a little broken. If there was a ??.py file that was undecodable the program would most likely want to know that file existed.

-Toshio
Re: [Python-Dev] RELEASED Python 3.0 final
Gregor Lingl wrote:
> [EMAIL PROTECTED] wrote:
>> To be fair, if someone asked me specifically about educating non-programmer adults about programming, I would probably at least *mention* py3, if not recommend it outright. The improved consistency is worth a lot in an educational setting. (But, if one is educating children and interested in soliciting their genuine enthusiasm, whiz-bang graphics are really a must-have, not a negotiable extra.)
>
> As a non native English speaker I'm not sure if I understand correctly, what you mean with whiz-bang graphics. Nevertheless I'd like to point you to the new turtle graphics module (which is part of the standard librarys since 2.6). At least it was designed especially for use in the educational domain. Moreover the source-distribution also contains a bunch of some ten example scripts.

I'm pretty sure he meant that turtle graphics are not whiz-bang (in this century, at least). Being able to do pygame-style OpenGL stuff would be whiz bang[1] in my book.

[1] http://www.merriam-webster.com/dictionary/whizbang

Tres.
--
Tres Seaver +1 540-429-0999 [EMAIL PROTECTED]
Palladion Software "Excellence by Design" http://palladion.com
Re: [Python-Dev] Python-3.0, unicode, and os.environ
Guido van Rossum wrote:
> At the risk of bringing up something that was already rejected, let me propose something that follows the path taken in 3.0 for filenames, rather than doubling back:
>
> For os.environ, os.getenv() and os.putenv(), I think a similar approach as used for os.listdir() and os.getcwd() makes sense: let os.environ skip variables whose name or value is undecodable, and have a separate os.environb() which contains bytes; let os.getenv() and os.putenv() do the right thing when the arguments passed in are bytes.

I prefer the method used by file.read() where an error is thrown when accessing undecodable data. I think in time python programmers will consider not throwing an exception a wart in python3. However, this is enough to allow programmers to do the right thing once an error is reported by users and the cause has been tracked down so it doesn't block fixing errors as the current code does.

And it's not like anyone expected python3 to be wart-free just because the python2 warts were fixed ;-)

> For sys.argv, because it's positional, you can't skip undecodable values, so I propose to use error=replace for the decoding; again, we can add sys.argvb that contains the raw bytes values.
>
> The various os.exec*() and os.spawn*() calls (as well as os.system(), os.popen() and the subprocess module) should all accept bytes as well as strings.

This also seems sane with the same comment about throwing errors.

-Toshio
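The sys.argv part of the proposal can be illustrated with the "replace" error handler that bytes.decode() already provides. This is only a sketch of the idea: decode_argv and raw_argv are hypothetical names, the raw argument list is made up, and UTF-8 is assumed where the real decoding would follow the locale.

```python
def decode_argv(raw_args, encoding="utf-8"):
    """Decode every argument, substituting U+FFFD for undecodable bytes."""
    return [arg.decode(encoding, errors="replace") for arg in raw_args]


# Hypothetical raw argument vector with one undecodable entry in the middle.
raw_argv = [b"prog", b"\xff\xfe", b"caf\xc3\xa9"]
argv = decode_argv(raw_argv)
```

Every argument keeps its position, which is the reason skipping is not an option here. The replacement is lossy, though: the original bytes cannot be recovered from argv, so a program that needs them would have to consult the raw list (the role the proposed sys.argvb would play).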
Re: [Python-Dev] Python-3.0, unicode, and os.environ
Hi,

>> But they are open questions (already asked in the bug tracker):
>
> I answered these in the bug tracker. Here are the answers for the mailing list:

Oh, sorry. I didn't follow the end of the discussion on the bug tracker.

>> os.environb['PATH'] = '\xff' => os.environ['PATH'] = ???
>
> os.environ['PATH'] => raises KeyError because PATH is not a key in the unicode decoded environment.

Ok, good answer :-)

>> os.environ['PATH'] = chr(0x1) => os.environb['PATH'] = ???
>
> raise UnicodeEncodeError when setting the value.

Ok, it's consistent with the current behaviour.

    $ LANG=C ./python
    Python 3.0rc3+ (py3k:67498M, Dec 4 2008, 17:45:54)
    >>> import os
    >>> os.environ['x'] = '\xff'
    >>> os.environ['x']
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/haypo/prog/py3k/Lib/io.py", line 1491, in write
        b = encoder.encode(s)
      File "/home/haypo/prog/py3k/Lib/encodings/ascii.py", line 22, in encode
        return codecs.ascii_encode(input, self.errors)[0]
    UnicodeEncodeError: 'ascii' codec can't encode character '\xff' in position 1: ordinal not in range(128)

Oh, that's strange :-p The error is delayed until we read the value.

>> It would be maybe easier if os.environ supports bytes and unicode keys. But we have to keep these assertions:
>>
>>    os.environ[bytes] -> bytes
>>    os.environ[str] -> str
>
> I think the same choices have to be made here. If LANG=C, we still have to decide what to do when os.environ[str] is set to a non-ASCII string.

If the charset is US-ASCII, os.environ will drop non-ASCII values. But most variables are ASCII only. Examples with my shell:

    $ env
    XCURSOR_THEME=kubuntu
    LANG=fr_FR.UTF-8
    EDITOR=vim
    HOME=/home/haypo
    ...

> Additionally, the subprocess question makes using the key value undesirable compared with having a separate os.environb that accesses the same underlying data.

The user should be able to choose bytes or unicode. Examples:

- subprocess.Popen('ls') => use unicode environment (os.environ)
- subprocess.Popen(b'ls') => use bytes environment (os.environb)

> Here's my problem with it, though. With these semantics any program that works on arbitrary files and runs on *NIX has to check os.listdir(b'') and do the conversion manually.

Only programs that have to support a strange environment like yours (mixing Shift-JIS and UTF-8) :-) Most programs don't have to support such a mixture of charsets.

We can imagine a higher-level library working on UNIX and Windows (bytes or Unicode). But that would be later.

> I think the desired behaviour assuming the existence of a nondecodable file is this:

I prefer the current behaviour :-)

> Why do you think that glob.glob('*.py') is special and should not traceback?

It's not special. glob() reuses listdir(), and it was an example to show that it just works.

> I just differ in that I think lack of tracebacks when UnicodeDecodeErrors are encountered is a wart in python3 that did not exist in python2.

Right.

--
Victor Stinner aka haypo
http://www.haypocalc.com/blog/
Re: [Python-Dev] Python-3.0, unicode, and os.environ
Toshio Kuratomi wrote:
> Guido van Rossum wrote:
>> Glob was just an example. Many use cases for directory traversal couldn't care less if they see *all* files.
>
> Okay. Makes it harder to prove correct or not if I don't know what the use case is :-) I can't think of a single use case off-hand. Even your example of a ??.txt file making retrieval of *.py files fail is a little broken. If there was a ??.py file that was undecodable the program would most likely want to know that file existed.

Why? Most programs won't be able to do anything with it. And if the program *can* do something with it... that's what the bytes version of the APIs are for.

Cheers,
Nick.

--
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
Re: [Python-Dev] __import__ docs follow-up
Georg Brandl wrote: Hi, as a follow-up to the thread a few days ago, and the bug report, I've rewritten most of the __import__ docs. I've attached the suggested patch to the issue http://bugs.python.org/issue4457. I'd be glad for reviews. Also, I'd like to ask about opinions if this winning idiom (as a bug comment states) should be in it, instead of the getattr() helper function:

    import sys
    __import__('x.y.z')
    mod = sys.modules['x.y.z']

That way is a lot cleaner than other mechanisms I've seen (including the current mechanism in the docs). Making that the recommended way of doing a dynamic import seems like a good idea to me. Cheers, Nick. -- Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
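A minimal sketch of the idiom discussed above, wrapped in a small helper. The name import_dotted is hypothetical (it is not a standard-library function), and xml.dom.minidom is used only because it is a stdlib module that lives two levels down in a package. The point it illustrates is that __import__ returns the top-level package, while sys.modules holds the fully qualified module.

```python
import sys


def import_dotted(name):
    """Import a dotted module name and return the innermost module.

    Hypothetical helper illustrating the sys.modules idiom.
    """
    # __import__('x.y.z') returns the top-level package 'x', not 'x.y.z' ...
    __import__(name)
    # ... but the fully qualified module is registered in sys.modules.
    return sys.modules[name]


top_level = __import__('xml.dom.minidom')
minidom = import_dotted('xml.dom.minidom')
print(top_level.__name__)
print(minidom.__name__)
```

This avoids walking the dotted name with repeated getattr() calls, which is presumably what the getattr() helper mentioned above does.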
Re: [Python-Dev] Python-3.0, unicode, and os.environ
Victor Stinner wrote: It would be maybe easier if os.environ supports bytes and unicode keys. But we have to keep these assertions: os.environ[bytes] -> bytes os.environ[str] -> str I think the same choices have to be made here. If LANG=C, we still have to decide what to do when os.environ[str] is set to a non-ASCII string. If the charset is US-ASCII, os.environ will drop non-ASCII values. But most variables are ASCII only. Examples with my shell: Yes. But you still have the question of what to do when: os.environ[str] = chr(0x1) So I don't think it makes things simpler than having separate os.environ and os.environb that update the same data behind the scenes. Additionally, the subprocess question makes using the key value undesirable compared with having a separate os.environb that accesses the same underlying data. The user should be able to choose bytes or unicode. Examples: the subprocess question was posed further up the thread as basically -- does the user need to access os.environb in order to override things in the environment when calling subprocess? I think the answer to that is yes since you might want to start with your environment and modify it slightly when you call programs via subprocess. If you just try to copy os.environ and os.environ only iterates through the decodable env vars, that doesn't work. If you have an os.environb to copy it becomes possible. - subprocess.Popen('ls') => use unicode environment (os.environ) - subprocess.Popen(b'ls') => use bytes environment (os.environb) That's... not expected to me :-( If I never touch os.environ and invoke subprocess the normal way, I'd still expect the whole environment to be passed on to the program being called. This is how invoking programs manually, shell scripting, invoking programs from perl, python2, etc work. Also, it's not really a good fit with the other things that key off of the initial argument. os.listdir(b'.') changes the output to bytes. 
subprocess.Popen(b'ls') would change what environment gets input into the call. Here's my problem with it, though. With these semantics any program that works on arbitrary files and runs on *NIX has to check os.listdir(b'') and do the conversion manually. Only programs that have to support strange environments like yours (mixing Shift-JIS and UTF-8) :-) Most programs don't have to support this kind of charset mixture. Any program that is intended to be distributed, accesses arbitrary files, and works on *nix platforms needs to take this into account. Just because the environment inside of my organization is sane doesn't mean that when we release the code to customers, clients, or the free software community that the places it runs will be as strict about these things. Are most programs specific to one organization or are they distributed to other people? I can't answer that... everything I work on (except passwords:-) is distributed -- from sys admin cronjobs to web applications since I'm lucky that my whole job is devoted to working on free software. -Toshio
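The "start with your environment and modify it slightly" case can be sketched roughly as follows. This assumes the os.environb and os.supports_bytes_environ names as they exist in later Python releases (3.2 and up, POSIX only); in this thread os.environb is still only a proposal. The helper child_environment and the variable DEMO_VAR are made up for illustration.

```python
import os
import subprocess
import sys


def child_environment(overrides):
    """Return a full copy of the current environment with str overrides applied.

    Hypothetical helper; 'overrides' maps str names to str values.
    """
    if os.supports_bytes_environ:
        # POSIX: os.environb exposes every variable as raw bytes, including
        # any that would not decode in the current locale.
        env = dict(os.environb)
        for key, value in overrides.items():
            env[os.fsencode(key)] = os.fsencode(value)
    else:
        # Windows: the native environment is already Unicode.
        env = dict(os.environ)
        env.update(overrides)
    return env


env = child_environment({'DEMO_VAR': 'hello'})  # DEMO_VAR is a made-up name
result = subprocess.run(
    [sys.executable, '-c', 'import os; print(os.environ["DEMO_VAR"])'],
    env=env, capture_output=True, text=True, check=True,
)
print(result.stdout.strip())
```

Copying the bytes mapping means nothing is lost on the way to the child process, which is the behaviour asked for above; the child still sees the whole environment even if the parent never looked at the undecodable entries.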
Re: [Python-Dev] Merging flow
Fred Drake wrote: On Dec 5, 2008, at 2:20 PM, Mark Dickinson wrote: Did I mess up somewhere, or does svnmerge not work on a revision that was itself the result of an svnmerge? I ran into this yesterday as well with my patch to the cgi module. The work-around was to revert the change to that property and edit it manually. I think this is a significant issue, since editing that property is about as error-prone as it can be. I've not really looked at the code in svnmerge.py, so I'm not sure how hard it would be to fix. I think we're discovering the real reasons why people generally prefer to use a DVCS when trying to manage multiple branches :P For now it looks like we might have to maintain 3.0 manually, with svnmerge only helping out for trunk->2.6 and trunk->py3k... Cheers, Nick. -- Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
Re: [Python-Dev] Python-3.0, unicode, and os.environ
Toshio Kuratomi wrote: Are most programs specific to one organization or are they distributed to other people? The former. That's pretty well documented in assorted IT literature ('shrink-wrap' and open source commodity software are still relatively new players on the scene that started to shift the balance the other way, but now the server side elements of web services are shifting it back again). Cheers, Nick. -- Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
Re: [Python-Dev] Merging flow
Nick Coghlan wrote: I think we're discovering the real reasons why people generally prefer to use a DVCS when trying to manage multiple branches :P For now it looks like we might have to maintain 3.0 manually, with svnmerge only helping out for trunk->2.6 and trunk->py3k... The problem seems to be trunk -> py3k -> 3.0. I had no issues with py3k -> 3.0. Christian
Re: [Python-Dev] Python-3.0, unicode, and os.environ
Nick Coghlan wrote: Toshio Kuratomi wrote: Are most programs specific to one organization or are they distributed to other people? The former. That's pretty well documented in assorted IT literature ('shrink-wrap' and open source commodity software are still relatively new players on the scene that started to shift the balance the other way, but now the server side elements of web services are shifting it back again). Cool. So it's only people writing code to be shared with the larger community or written for multiple customers that are affected by bugs like this. :-/ -Toshio
Re: [Python-Dev] Python-3.0, unicode, and os.environ
Nick Coghlan wrote: Toshio Kuratomi wrote: Guido van Rossum wrote: Glob was just an example. Many use cases for directory traversal couldn't care less if they see *all* files. Okay. Makes it harder to prove correct or not if I don't know what the use case is :-) I can't think of a single use case off-hand. Even your example of a ??.txt file making retrieval of *.py files fail is a little broken. If there was a ??.py file that was undecodable the program would most likely want to know that file existed. Why? Most programs won't be able to do anything with it. And if the program *can* do something with it... that's what the bytes version of the APIs are for. Nonsense. A program can do tons of things with a non-decodable filename. Where it's limited is non-decodable filedata. For instance, if you have a graphical text editor, you need to let the user select files to load. To do that you need to list all the files in a directory, even the ones that aren't decodable. The ones that aren't decodable need to substitute something like: str(filename, errors='replace') + '(Filename not encoded in UTF8)' in the file listing that the user sees. When the file is loaded, it needs to access the actual raw filename. The file can then be loaded and operated upon and even saved back to disk using the raw, undecodable filename. If you have a file manager, you need to code something that lets the user move the file around. Once again, the program loads the raw filenames. It transforms the name into something representable to the user. It displays that. The user selects it and asks that it be moved to another location. Then the program uses the raw filename to move from one location to another. If you have a backup program, you need to list all the files in a directory. Then you need to copy those files to another location. Once again you have to retrieve the byte version of any non-decodable filenames. 
-Toshio
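The editor / file manager pattern described above (show a readable label, keep the raw name for the actual file operation) might look roughly like this. It is a sketch, not anyone's actual implementation: display_label and list_for_display are made-up names, and UTF-8 is assumed as the expected filename encoding.

```python
import os

NOTE = ' (filename not encoded in UTF-8)'


def display_label(raw_name):
    """Return a printable label for a raw (bytes) filename."""
    try:
        return raw_name.decode('utf-8')
    except UnicodeDecodeError:
        # Undecodable bytes become U+FFFD so the user still sees something.
        return raw_name.decode('utf-8', errors='replace') + NOTE


def list_for_display(directory=b'.'):
    """Pair each label with the raw name needed to open, move or copy the file."""
    # Passing bytes to os.listdir() makes it return bytes, so nothing is skipped.
    return [(display_label(raw), raw) for raw in os.listdir(directory)]


for label, raw in list_for_display():
    print(label, raw)
```

The label is only ever shown to the user; every filesystem call goes through the raw bytes, so a badly encoded name can still be loaded, moved or backed up.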
Re: [Python-Dev] Merging flow
On Dec 5, 2008, at 5:31 PM, Nick Coghlan wrote: I think we're discovering the real reasons why people generally prefer to use a DVCS when trying to manage multiple branches :P Really? I don't. The issue has nothing to do with someone maintaining private change sets, or wanting to do development with local commits without having access to commit to the project. I expect (and someone from work has said they do as well) that Subversion 1.5's merge tracking would have handled this situation. For now it looks like we might have to maintain 3.0 manually, with svnmerge only helping out for trunk->2.6 and trunk->py3k... I don't know if I'll have time to look at svnmerge this weekend (with house guests and all), but I really don't expect it's a difficult problem to solve in the tool. The behavior suggests that this tiered set of branch relationships wasn't expected. -Fred -- Fred Drake fdrake at acm.org
[Python-Dev] Merging flow
Nick Coghlan wrote: For now it looks like we might have to maintain 3.0 manually, with svnmerge only helping out for trunk->2.6 and trunk->py3k Does it make the bookkeeping horrible if you merge from trunk straight to 3.0, and then block the svnmerged changes from propagating? -jJ
Re: [Python-Dev] RELEASED Python 3.0 final
Good. Now we just need to populate them. I take it the classifiers without minor numbers imply any known minor version (e.g., 2 == 2.3 and greater)? Perhaps. As usual, they mean what people use them for. I intended them to mean 2.x and 3.x, respectively, with no constraint on x (i.e. including possibly 2.0 and 2.1). In particular, presence of 2 and absence of 3 is meant to indicate "I know that it won't work on Python 3". Regards, Martin
Re: [Python-Dev] Python-3.0, unicode, and os.environ
On Fri, 5 Dec 2008 at 12:11, Guido van Rossum wrote: On Fri, Dec 5, 2008 at 12:05 PM, Toshio Kuratomi [EMAIL PROTECTED] wrote: Guido van Rossum wrote: On Fri, Dec 5, 2008 at 2:27 AM, Ulrich Eckhardt [EMAIL PROTECTED] wrote: In 99% of all cases, using the default encoding will work and do what people expect, which is why I would make this conversion automatic. In all other cases, it will at least not fail silently (which would lead to garbage and data loss) and allow more sophisticated applications to handle it. I think the always fail noisily approach isn't the best approach. E.g. if I am globbing for *.py, and there's an undecodable .txt file in a directory, its presence shouldn't cause the glob to fail. But why should it make glob() fail? This sounds like an implementation detail of glob. Glob was just an example. Many use cases for directory traversal couldn't care less if they see *all* files. I agree with Toshio. The only use case I can think of for not seeing all files is when selecting a subset, and if the thing that does the selecting only generates a traceback if a file that falls into the subset is undecodable, then I don't see a problem. That is, if I'm selecting a subset of the files in a directory, and one of that subset is undecodable, I _want_ a traceback, because I'll be wanting _all_ of the files that match my selection criteria.(*) So I'm curious to hear your use cases where undecodable files are don't care. (*) More specifically, I want the program of a developer who didn't think about the fact that users might have files with undecodable filenames in their directory to generate a traceback rather than silently losing those files. (This is spoken to both by the principle of least surprise and the zen rule that errors should never pass silently :) --RDM ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Python-3.0, unicode, and os.environ
Toshio Kuratomi wrote: Nick Coghlan wrote: Toshio Kuratomi wrote: Guido van Rossum wrote: Glob was just an example. Many use cases for directory traversal couldn't care less if they see *all* files. Okay. Makes it harder to prove correct or not if I don't know what the use case is :-) I can't think of a single use case off-hand. Even your example of a ??.txt file making retrieval of *.py files fail is a little broken. If there was a ??.py file that was undecodable the program would most likely want to know that file existed. Why? Most programs won't be able to do anything with it. And if the program *can* do something with it... that's what the bytes version of the APIs are for. Nonsense. A program can do tons of things with a non-decodable filename. Where it's limited is non-decodable filedata. You can't display a non-decodable filename to the user, hence the user will have no idea what they're working on. Non-filesystem related apps have no business trying to deal with insane filenames. Linux is moving towards a standard of UTF-8 for filenames, and once we get to the point where the idea of encoding filenames and environment variables any other way is seen as crazy, then the Python 3 approach will work seamlessly. In the meantime, raw bytes APIs will provide an alternative for those that disagree with that philosophy. Cheers, Nick. -- Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia --- ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] RELEASED Python 3.0 final
On Fri, Dec 5, 2008 at 19:10, Guido van Rossum [EMAIL PROTECTED] wrote: On Thu, Dec 4, 2008 at 11:27 PM, [EMAIL PROTECTED] wrote: With all due respect, for me, library support and serious use are synonymous. Glyph, I cannot have a discussion with you if every single post of yours is longer than my combined daily output. Please spend some time writing shorter posts. I'm sure I'm not the only one here with a short attention span. :-) Allow me to paraphrase glyph (with whom I'm in complete agreement, for what it's worth): many newbies will be disappointed by Python if they start with Python 3.0 and discover that most of the cool possibilities they had heard about are 'being worked on' and not quite ready. I don't doubt that 3.0 will be easier for the new programmer to learn, but I do not believe the average Oh, I heard about Python, let's learn it person should be pointed to 3.0 right now. They should be encouraged to learn 2.6 -- or even 2.5. In spite of Python being a programming language, there is a difference between 'casual user of the language' and 'library developer'; 3.0 is certainly a must for all actual library developers, and I'm sure most of them know about 3.0 by now. We're talking about first impressions for people without that knowledge. -- Thomas Wouters [EMAIL PROTECTED] Hi! I'm a .signature virus! copy me into your .signature file to help me spread! ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Python-3.0, unicode, and os.environ
On Fri, Dec 5, 2008 at 18:48, Nick Coghlan [EMAIL PROTECTED] wrote: Toshio Kuratomi wrote: Nick Coghlan wrote: Toshio Kuratomi wrote: Guido van Rossum wrote: Glob was just an example. Many use cases for directory traversal couldn't care less if they see *all* files. Okay. Makes it harder to prove correct or not if I don't know what the use case is :-) I can't think of a single use case off-hand. Even your example of a ??.txt file making retrieval of *.py files fail is a little broken. If there was a ??.py file that was undecodable the program would most likely want to know that file existed. Why? Most programs won't be able to do anything with it. And if the program *can* do something with it... that's what the bytes version of the APIs are for. Nonsense. A program can do tons of things with a non-decodable filename. Where it's limited is non-decodable filedata. You can't display a non-decodable filename to the user, hence the user will have no idea what they're working on. Non-filesystem related apps have no business trying to deal with insane filenames. And what of python's batteries---does a library that takes filenames or directories from a controlling program and processes the contents of the file need to care whether the file can be encoded properly? Is said library filesystem related or not? Won't it be awful when it's the directory name, and processing the file works if you change into its directory, but not if you're outside of it? And if there's an error during processing and the library reports a full filename using os.abspath(file.ext), but cannot get the results? Linux is moving towards a standard of UTF-8 for filenames, and once we get to the point where the idea of encoding filenames and environment variables any other way is seen as crazy, then the Python 3 approach will work seamlessly. In the meantime, raw bytes APIs will provide an alternative for those that disagree with that philosophy. 
And until that time, it's agony for the library writers who didn't think they needed to care, but find that their users (other developers) do. -- Michael Urman
Re: [Python-Dev] Python-3.0, unicode, and os.environ
On Sat, 6 Dec 2008 09:18:47 am Nick Coghlan wrote: Toshio Kuratomi wrote: Guido van Rossum wrote: Glob was just an example. Many use cases for directory traversal couldn't care less if they see *all* files. Okay. Makes it harder to prove correct or not if I don't know what the use case is :-) I can't think of a single use case off-hand. Even your example of a ??.txt file making retrieval of *.py files fail is a little broken. If there was a ??.py file that was undecodable the program would most likely want to know that file existed. Why? Most programs won't be able to do anything with it. But the program can report a sensible error message, so the user can fix the problem. I'd rather have the Python API report errors than silence them, at least by default. I don't suppose it's on the table for functions to grow an extra argument that tells them to skip broken file names and environment variables? What I have in mind is something like: os.listdir(path, silence_errors=False) -> list_of_strings By default, if a filename in path is not a valid string, an exception is raised, with the guilty file name given in bytes as an attribute of the exception. If silence_errors is true, the invalid file names are silently skipped. -- Steven
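The semantics proposed above can be prototyped on top of the bytes API without touching os.listdir() itself. Everything named here (UndecodableFilenameError, decode_names, listdir_checked) is hypothetical and exists only for this sketch; the real os.listdir() has no silence_errors argument.

```python
import os
import sys


class UndecodableFilenameError(ValueError):
    """Raised for a name that cannot be decoded; keeps the raw bytes around."""

    def __init__(self, raw_name, encoding):
        super().__init__('cannot decode %r as %s' % (raw_name, encoding))
        self.raw_name = raw_name  # the "guilty" file name, as bytes


def decode_names(raw_names, encoding, silence_errors=False):
    """Strictly decode each name; raise or skip on failure, as requested."""
    names = []
    for raw in raw_names:
        try:
            names.append(raw.decode(encoding))
        except UnicodeDecodeError:
            if silence_errors:
                continue  # caller asked for broken names to be skipped
            raise UndecodableFilenameError(raw, encoding) from None
    return names


def listdir_checked(path, silence_errors=False):
    """Like os.listdir(path) for str paths, but noisy by default."""
    raw_names = os.listdir(os.fsencode(path))
    return decode_names(raw_names, sys.getfilesystemencoding(), silence_errors)


print(decode_names([b'a.py', b'caf\xe9.py'], 'utf-8', silence_errors=True))
```

A caller that catches the exception can read raw_name and fall back to the bytes API for just that one file.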
Re: [Python-Dev] Python-3.0, unicode, and os.environ
Toshio Kuratomi wrote: Nick Coghlan wrote: Toshio Kuratomi wrote: Are most programs specific to one organization or are they distributed to other people? The former. That's pretty well documented in assorted IT literature ('shrink-wrap' and open source commodity software are still relatively new players on the scene that started to shift the balance the other way, but now the server side elements of web services are shifting it back again). Cool. So it's only people writing code to be shared with the larger community or written for multiple customers that are affected by bugs like this. :-/ True, but it's still a fairly important problem to have a solution to. Even internally in large organisations there can be some pretty insane environments as cruft accumulates over the years. Cheers, Nick. -- Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
Re: [Python-Dev] RELEASED Python 3.0 final
There was already Programming Language :: Python, provided by many packages. I think version compatibility relationships meant by each of these classifiers should be made explicit, wherever it is that documentation for classifiers is provided. I don't recall having seen any such documentation; hopefully I just need to be hit by another clue. There is no documentation for classifiers whatsoever. I don't think nuances matter much, anyway. Regards, Martin
Re: [Python-Dev] Python-3.0, unicode, and os.environ
5) represent all environment variables in Unicode strings, including the ones that currently fail to decode. (then do the same to file names, then drop the byte-oriented file operations again) Please, don't do that! Bytes are not characters! And environment variables, command line arguments, and file names are not bytes, but characters. Regards, Martin
Re: [Python-Dev] Python-3.0, unicode, and os.environ
On Dec 5, 2008, at 7:48 PM, Nick Coghlan wrote: You can't display a non-decodable filename to the user, hence the user will have no idea what they're working on. Non-filesystem related apps have no business trying to deal with insane filenames. Sigh, same arguments, all over again. Again, *both* KDE and Gnome apps display non-decodable filenames to the user, and let the user work with the files. They display as good a rendition as they can, using a replacement character as appropriate. In some earlier versions, KDE did not work at all on poorly-encoded files, and users submitted bug reports. People do care, it does happen in real life, and it is a bug in your software if you cannot deal with the users' files. They just want the software to work. If it shows something weird in the window titlebar, that's a bit irritating but at least it doesn't get in the way of working. Linux is moving towards a standard of UTF-8 for filenames, and once we get to the point where the idea of encoding filenames and environment variables any other way is seen as crazy, then the Python 3 approach will work seamlessly. I seriously doubt that Linux would ever enforce utf-8 filenames/env vars/command arguments. Oddly encoded strings will always be with us in some form or another. Now, perhaps you use crontab? At least on the systems I have, programs run by cron don't have any locale environment variables set, and so default to the C locale. So utf-8 encoded filenames/etc will fail, by default, for any python3 program run under cron. I'd like to make an analogy: what if Python3 couldn't deal with filenames with spaces in them on unix? Most filenames don't have spaces in them, so it should be okay, right? And those people who really need to deal with space-containing filenames can use this other API variant, instead of the recommended and most obvious one. That'd be okay, right? No, of course it wouldn't be okay! 
James
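The cron scenario above comes down to one decoding step. The sketch below assumes that the C locale implies an ASCII filesystem encoding, as described for the interpreter under discussion; try_decode is a made-up helper and the filename is an arbitrary example.

```python
def try_decode(raw, encoding):
    """Return the decoded name, or None if it is not valid in 'encoding'."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# A perfectly ordinary UTF-8 filename, as stored on disk.
raw_name = 'r\u00e9sum\u00e9.txt'.encode('utf-8')

print(try_decode(raw_name, 'utf-8'))  # what an interactive UTF-8 session sees
print(try_decode(raw_name, 'ascii'))  # what the same program sees under LANG=C
```

The file on disk is identical in both cases; only the environment the program was started from differs, which is why the same script can work at the prompt and fail from a crontab entry.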
Re: [Python-Dev] Python-3.0, unicode, and os.environ
On Fri, Dec 5, 2008 at 19:22, Martin v. Löwis [EMAIL PROTECTED] wrote: Please, don't do that! Bytes are not characters! And environment variables, command line arguments, and file names are not bytes, but characters. On Windows NT, sure. On Unix they're still bytes no matter how much we want them to be characters. This difference, and secondarily the way python 3 tries to sweep it under the rug, seem to be the roots of the problem. -- Michael Urman
Re: [Python-Dev] RELEASED Python 3.0 final
On Sat, 6 Dec 2008 12:47:45 pm Guido van Rossum wrote: But I disagree that most of the cool possibilities they have heard about are necessarily third party libraries. Python's standard library has lots of stuff to offer. +1 on that. I've been using Python for a decade now, and the first third party library I've downloaded and used was Pyparsing a month or two ago. I'll be the first to admit that my programs tend to be on the small side, but they're useful to me. The lack of third party libraries for Python 3 is not necessarily a show-stopper. -- Steven
Re: [Python-Dev] Python-3.0, unicode, and os.environ
And environment variables, command line arguments, and file names are not bytes, but characters. On Windows NT, sure. On Unix they're still bytes no matter how much we want them to be characters. Only in the API of the OS itself. Treating them as bytes in the application is a mistake. The bytes are intended to represent characters, so Python should treat them as what they are. Regards, Martin
Re: [Python-Dev] Python-3.0, unicode, and os.environ
On Sat, 6 Dec 2008 11:48:27 am Nick Coghlan wrote: Toshio Kuratomi wrote: Nick Coghlan wrote: ... Why? Most programs won't be able to do anything with it. And if the program *can* do something with it... that's what the bytes version of the APIs are for. Nonsense. A program can do tons of things with a non-decodable filename. Where it's limited is non-decodable filedata. You can't display a non-decodable filename to the user, hence the user will have no idea what they're working on. Non-filesystem related apps have no business trying to deal with insane filenames. I don't agree. Putting my user's hat on, I know what I would expect: the app should display *some* name, it doesn't matter exactly what, so long as: * it's as close as possible to the real name; * it is unique in that directory (doesn't shadow another file); and * it's enough to identify the file so I can read/save/delete/rename the file. I think there are analogous situations: long-time Windows users will be used to seeing files listed as longfilename.txt in some applications and longfi~1.txt in another. Under POSIX, file names can contain unprintable ctrl characters, and the shell will print them at least three ways, depending on context. E.g. for a file containing a formfeed, I get one of ? \f or ^L in bash. Applications can deal with such weird file names. KDE's file manager (konqueror) and file selection dialog both show the character as a small square, presumably the font's missing character glyph, and KDE apps can open and save the file. Still speaking as a user, I think it is quite reasonable to expect applications to deal with undisplayable filenames: displaying the name and opening the file are orthogonal concepts, although I accept that command-line interfaces will have difficulty with file names that can't be typed by the user! I appreciate that broken unicode is more difficult to deal with than unprintable control characters, but the basic principle is the same. 
-- Steven
Re: [Python-Dev] RELEASED Python 3.0 final
Thomas Wouters [EMAIL PROTECTED] wrote: Allow me to paraphrase glyph (with whom I'm in complete agreement, for what it's worth): many newbies will be disappointed by Python if they start with Python 3.0 and discover that most of the cool possibilities they had heard about are 'being worked on' and not quite ready. I don't doubt that 3.0 will be easier for the new programmer to learn, but I do not believe the average "Oh, I heard about Python, let's learn it" person should be pointed to 3.0 right now. They should be encouraged to learn 2.6 -- or even 2.5. I think that's right. I was asked this question today, and it comes up (to me) fairly often at PARC. I usually suggest using the Python version that's standard for the user's platform, if they use OS X or Linux (and most do), which is typically 2.5 (for OS X Leopard), and 2.4 (for Linux -- may be out of date). For Windows users, I suggest the latest release (2.6). Bill
Re: [Python-Dev] Python-3.0, unicode, and os.environ
Ulrich Eckhardt wrote: On Friday 05 December 2008, Guido van Rossum wrote: At the risk of bringing up something that was already rejected, let me propose something that follows the path taken in 3.0 for filenames, rather than doubling back: For os.environ, os.getenv() and os.putenv(), I think a similar approach as used for os.listdir() and os.getcwd() makes sense: let os.environ skip variables whose name or value is undecodable, and have a separate os.environb() which contains bytes; let os.getenv() and os.putenv() do the right thing when the arguments passed in are bytes. For sys.argv, because it's positional, you can't skip undecodable values, so I propose to use error=replace for the decoding; again, we can add sys.argvb that contains the raw bytes values. The various os.exec*() and os.spawn*() calls (as well as os.system(), os.popen() and the subprocess module) should all accept bytes as well as strings. On Windows, the bytes APIs should probably not exist. I predict that most developers can get away with not using the bytes APIs at all. The small minority that needs to be robust if not all filenames use the system encoding can use the bytes APIs. I know some of those developers, you can contact them via [EMAIL PROTECTED] Seriously, what would you suggest to someone that wants to handle paths in a portable way? Using the Unicode variants of functions is fubar, because encoding/decoding is not universally possible. Using the byte variant is equally fubar, because e.g. on MS Windows it is not supported, except through a very lossy roundtrip through the locale's codepage, limiting your functionality. I actually think it is about time to give up on trying to think about a path as a string. Ditto for data received from os.environ or sys.argv. There are only very few things that are universal to them and a reliable encoding is none of them. Then, once you have let that idea go, meditate a bit over the Zen. 
What I propose is that paths must be treated as OS-specific, with the only common reliable operations being joining them, concatenating them and splitting them into segments divided by the (again, OS-specific) separator. Other operations, like e.g. appending a string or converting it to a string in order to display it can fail. And if they fail, they should fail noisily. In 99% of all cases, using the default encoding will work and do what people expect, which is why I would make this conversion automatic. In all other cases, it will at least not fail silently (which would lead to garbage and data loss) and allow more sophisticated applications to handle it. Amen! The idea that paths, environment variables, and stuff pulled off of sockets can be treated as text rather than strings is just wishful thinking. Tres. -- Tres Seaver +1 540-429-0999 [EMAIL PROTECTED] Palladion Software Excellence by Design http://palladion.com
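Ulrich's proposal above (opaque paths that can be joined and split reliably, with conversion to text allowed to fail noisily) could be sketched roughly as follows. This is only an illustration: the OpaquePath class, its method names, and the choice of UTF-8 as the default encoding are all hypothetical and appear nowhere in the thread or in Python itself.

```python
class OpaquePath:
    """Hypothetical path type: a tuple of raw byte segments, not a string."""

    def __init__(self, *segments: bytes):
        self.segments = tuple(segments)

    @classmethod
    def split(cls, raw: bytes, sep: bytes = b"/") -> "OpaquePath":
        # Splitting on the separator never needs to know the encoding.
        return cls(*[seg for seg in raw.split(sep) if seg])

    def join(self, other: "OpaquePath") -> "OpaquePath":
        # Joining is also encoding-agnostic: just concatenate the segments.
        return OpaquePath(*(self.segments + other.segments))

    def to_text(self, encoding: str = "utf-8") -> str:
        # Strict decoding: an undecodable segment raises UnicodeDecodeError
        # rather than silently producing garbage.
        return "/".join(seg.decode(encoding, errors="strict")
                        for seg in self.segments)


good = OpaquePath(b"usr").join(OpaquePath(b"lib"))
print(good.to_text())

bad = OpaquePath.split(b"home/caf\xe9/notes.txt")
try:
    bad.to_text()
except UnicodeDecodeError as exc:
    print("cannot display path:", exc.reason)
```

The point of the sketch is that join and split never touch an encoding, so they cannot fail on odd file names; only the explicit conversion for display can, and it does so with an exception.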
Re: [Python-Dev] Python-3.0, unicode, and os.environ
On Sat, 6 Dec 2008 at 13:06, Steven D'Aprano wrote: Applications can deal with such weird file names. KDE's file manager (konqueror) and file selection dialog both show the character as a small square, presumably the font's missing character glyph, and KDE apps can open and save the file. Still speaking as a user, I think it is quite reasonable to expect applications to deal with undisplayable filenames: displaying the name and opening the file are orthogonal Agreed. I would file a bug report if an application couldn't handle a file that validly exists in my file system, no matter how broken the filename might appear to be. concepts, although I accept that command-line interfaces will have difficulty with file names that can't be typed by the user! Difficult, but not impossible: tab completion in the shell can allow the user to submit otherwise difficult-to-type filenames to a program. Which means python should be able to handle such things in argument strings, so that my python utilities can manipulate such files when specified as command line arguments, and a sensible error should be generated by default if the program hasn't been written in such a way that it can handle such input. It would be wonderful if all Unix variants would switch to all UTF-8 (I have done so on my own machines...I think :). But it is a slow process. --RDM
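The two behaviours discussed above (showing an undisplayable name with a placeholder glyph while still operating on the real bytes, and raising a sensible error by default when a program only expects text) might look something like this minimal sketch. The function names display_name and decode_argument are made up for illustration, and UTF-8 is assumed as the expected encoding.

```python
def display_name(raw_name: bytes, encoding: str = "utf-8") -> str:
    # For display only: undecodable bytes become U+FFFD, much like a file
    # manager showing a "missing glyph" box. Keep raw_name for opening.
    return raw_name.decode(encoding, errors="replace")


def decode_argument(raw: bytes, encoding: str = "utf-8") -> str:
    # Default behaviour for programs that only handle text arguments:
    # fail with a readable message instead of mangling the name.
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        raise ValueError(
            f"argument {raw!r} is not valid {encoding}; "
            "handle it as raw bytes instead"
        ) from exc


raw = b"report-\xff.txt"
print(display_name(raw))   # shown to the user; raw is what gets opened
try:
    decode_argument(raw)
except ValueError as err:
    print("error:", err)
```

Displaying and opening stay orthogonal here because the replacement text is never used to reach the file; only the untouched bytes are.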
Re: [Python-Dev] RELEASED Python 3.0 final
On 5 Dec, 06:10 pm, [EMAIL PROTECTED] wrote: On Thu, Dec 4, 2008 at 11:27 PM, [EMAIL PROTECTED] wrote: With all due respect, for me, library support and serious use are synonymous. Glyph, I cannot have a discussion with you if every single post of yours is longer than my combined daily output. Please spend some time writing shorter posts. I'm sure I'm not the only one here with a short attention span. :-) I already spend a lot of time trying to remove extraneous details. The drafts of these messages are usually 3x as long :). So, trying to keep it short: Thomas paraphrased my point pretty well. The importance of libraries cannot be overemphasized. Maybe you're right and the stdlib is enough for a large audience, but I don't know that audience. Everyone I know who uses Python, uses it because of a library. In some cases, an equivalent library exists for another language, and Python wins because it has a nicer syntax. But, in no case does Python win where it *doesn't* have the library. I think that the marketing for py3 needs to target library vendors before targeting novices. If the novices are targeted first, they are going to have a bad experience when python libraries don't work with py3, and library maintainers are going to have a bad experience when clueless newbies harass them to update their software without understanding the magnitude of the work to do so. I've been predicting this for years, but two days into Python 3's release, I've already seen real-world examples of this pattern in #twisted. I can tell these people to downgrade to py2 when they come ask me for help, but I don't think most of them ask for help. They just get angry and learn Java instead.
Re: [Python-Dev] Python-3.0, unicode, and os.environ
Nick Coghlan writes: True, but it's still a fairly important problem to have a solution to. Even internally in large organisations there can be some pretty insane environments as cruft accumulates over the years. M&A and globalization make it inevitable. Toshio will remember the Mizuho April Fool's Day fiasco (a couple of large banks merged, and when they reopened as a merged entity called Mizuho, the ATM system immediately crashed). Japan being a country that doesn't believe in GAAP, such mergers are a very difficult problem. I don't know the details, but I wouldn't even be surprised if encodings played a role in that mess because Japanese companies often have their own internal variants of the national standard JIS encoding.
Re: [Python-Dev] Python-3.0, unicode, and os.environ
Martin v. Löwis writes: 5) represent all environment variables in Unicode strings, including the ones that currently fail to decode. (then do the same to file names, then drop the byte-oriented file operations again) Please, don't do that! Bytes are not characters! And environment variables, command line arguments, and file names are not bytes, but characters. Unfortunately, both POSIX and OS implementation practice (including, for example, VFAT file systems: NT-derived OSes are not safe!) say otherwise, and that makes your line of argument extremely dangerous. Remember, in a fight between human custom and machine programming, the machine can always win by crashing. For that reason, bytes must be the underlying representation, always available, although I think it's essential to make a text representation easily accessible, and even the default. Humans who would rather kvetch about the machine's breakage than get a useful answer can (and should---problems will be rare for most usage patterns) use the text representation. Humans who want reliability or debuggability, on the other hand, should have something that cannot be mistaken for text immediately available.
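To make the "bytes underneath, text view by default" arrangement concrete, here is a rough sketch of how a text view could be derived from raw bytes along the lines proposed earlier in this thread: environment entries that fail to decode are skipped, while positional arguments are decoded with replacement so that positions are preserved. The helper names text_environ and text_argv and the UTF-8 default are assumptions for illustration only; the sketch deliberately works on plain dicts and lists rather than on any real os or sys attribute.

```python
def text_environ(raw_environ, encoding="utf-8"):
    """Build a str view of a bytes environment, skipping undecodable entries."""
    text = {}
    for raw_key, raw_value in raw_environ.items():
        try:
            text[raw_key.decode(encoding)] = raw_value.decode(encoding)
        except UnicodeDecodeError:
            # Name or value is not valid text: leave it out of the text
            # view. It remains available in raw_environ.
            continue
    return text


def text_argv(raw_argv, encoding="utf-8"):
    """Decode positional arguments; positions matter, so nothing is skipped."""
    return [arg.decode(encoding, errors="replace") for arg in raw_argv]


raw_env = {b"HOME": b"/home/user", b"BROKEN": b"\xff\xfe"}
print(text_environ(raw_env))
print(text_argv([b"prog", b"caf\xc3\xa9", b"\xff"]))
```

Callers who need reliability keep working with raw_env and the raw argument list; the text view is a convenience layered on top and never replaces the bytes.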
Re: [Python-Dev] RELEASED Python 3.0 final
On 01:47 am, [EMAIL PROTECTED] wrote: In spite of Python being a programming language, there is a difference between 'casual user of the language' and 'library developer'; 3.0 is certainly a must for all actual library developers, and I'm sure most of them know about 3.0 by now. We're talking about first impressions for people without that knowledge. Well if most library developers already know 3.0 by now, I would hope they aren't going to sit on their hands, and solve the issues at hand! The best thing for 3.0 adoption would be a 3.0 welcoming committee. A group of hackers wandering from one popular open source library to another, writing patches for 3.x compatibility issues. There must be lots of people who care about 3.x adoption, and this is probably the most effective way they can reach that goal. Each time I am going to fix a 3.0 compatibility issue, I have a choice: I can either make Twisted itself better (add features, fix bugs), or I can keep Twisted exactly the same but do lots of work so it will work on 3.0. It seems pretty clear to me that, to the extent that I have time for Twisted, fixing bugs in the HTTP implementation would be a better deal than puzzling through a megabyte of diffs generated by 2to3, trying to understand where it went wrong, and how. This doesn't mean I'm sitting on my hands. It just means I have better things to be doing with my hands. (To be precise, 1054 better things to do, re: Twisted. Add in the Divmod projects and it's more like 3000.) Of course the distant threat of an unmaintained 2.x series is enough to motivate me to push a *little* in this direction, but it doesn't make me happy about it. I think this is exactly what the marketing effort around 3.0 needs to be doing: making a positive case for library and application authors to spend time to update to 3.x. This is a lot of work, and many (I might even say most) of us need a lot of cajoling. Free patches are a good incentive :). 
Re: [Python-Dev] Python-3.0, unicode, and os.environ
There has been some discussion here that users should use the str or byte function variant based on what is relevant to their system, for example when getting a list of file names or opening a file. That thought process really doesn't do much for those of us that write code that needs to run on any platform type, without alteration or the addition of complex if-statements and/or exceptions. Whatever the resolution here, and those of you addressing this thorny issue have my admiration, the solution should be such that it gives consistent behavior regardless of platform type and doesn't require the programmer to know all the minute details of each possible target platform. That may not be possible for a while, so interim solutions should be such that they minimize later pain. If that means hiding implementation details behind a new function, so be it. Then, at least, the body of one's app is not burdened with this problem later when conditions change. I'm glad I'm not the only one with hard problems. ;-) Larry