Re: [Python-Dev] Import and unicode: part two

2011-01-25 Thread Stephen J. Turnbull
As Nick points out, nobody really seems to think this is an argument against your patch. I'm going to bow out of this thread after this post, as I'm clearly out of my technical depth. Victor Stinner writes: On Monday 24 January 2011 11:35:22, Stephen J. Turnbull wrote: ... VFAT-formatted

Re: [Python-Dev] Import and unicode: part two

2011-01-25 Thread Xavier Morel
On 2011-01-25, at 04:26 , Toshio Kuratomi wrote: * If you can pick a set of encodings that are valid (utf-8 for Linux and MacOS HFS+ uses UTF-16 in NFD (actually in an Apple-specific variant of NFD). Right here you've already broken Python modules on OSX. And as far as I know, Linux
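
To illustrate the normalization point in the snippet above, here is a minimal sketch of how the same visible name can exist as two different code point sequences. The name `café` is a hypothetical example, and the standard `unicodedata` module only implements the standard NFD form, not the Apple-specific variant mentioned in the message.

```python
import unicodedata

# Hypothetical module name containing one non-ASCII character.
name_nfc = unicodedata.normalize("NFC", "caf\u00e9")  # 'é' as a single code point
name_nfd = unicodedata.normalize("NFD", name_nfc)     # 'e' followed by a combining acute accent


def same_name(a: str, b: str) -> bool:
    """Compare two names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(len(name_nfc), len(name_nfd))    # 4 5
print(name_nfc == name_nfd)            # False
print(same_name(name_nfc, name_nfd))   # True
```

A naive string comparison between a name typed by a user (often NFC) and a name read back from a file system that stores decomposed forms can therefore fail, which seems to be the kind of breakage the message is warning about.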

Re: [Python-Dev] tahoe-lafs

2011-01-25 Thread Nick Coghlan
On Tue, Jan 25, 2011 at 2:18 AM, Earney, Billy C. ear...@umsystem.edu wrote: I want to make it clear that I am in no way associated with the tahoe-lafs project. I do not want my email to make that project look bad. That was not my intention. Good to know. I was also in a somewhat grumpy

Re: [Python-Dev] PEP 393: Flexible String Representation

2011-01-25 Thread Nick Coghlan
On Tue, Jan 25, 2011 at 6:17 AM, Martin v. Löwis mar...@v.loewis.de wrote: A new function PyUnicode_AsUTF8 is provided to access the UTF-8 representation. It is thus identical to the existing _PyUnicode_AsString, which is removed. The function will compute the utf8 representation when first

Re: [Python-Dev] r88178 - python/branches/py3k/Lib/test/crashers/underlying_dict.py

2011-01-25 Thread Antoine Pitrou
On Tue, 25 Jan 2011 01:00:28 +0100 (CET) benjamin.peterson python-check...@python.org wrote: Author: benjamin.peterson Date: Tue Jan 25 01:00:28 2011 New Revision: 88178 Log: another pretty crasher served up by pypy Some comments would be nice. Right now it looks pretty close to

Re: [Python-Dev] Import and unicode: part two

2011-01-25 Thread exarkun
On 09:22 am, catch-...@masklinn.net wrote: On 2011-01-25, at 04:26 , Toshio Kuratomi wrote: * If you can pick a set of encodings that are valid (utf-8 for Linux and MacOS HFS+ uses UTF-16 in NFD (actually in an Apple-specific variant of NFD). Right here you've already broken Python

Re: [Python-Dev] [Python-checkins] r88155 - python/branches/py3k/Doc/whatsnew/3.2.rst

2011-01-25 Thread Nick Coghlan
On Mon, Jan 24, 2011 at 11:51 AM, raymond.hettinger python-check...@python.org wrote: Author: raymond.hettinger Date: Mon Jan 24 02:51:49 2011 New Revision: 88155 Log: Add entries for dis, dbm, and ctypes. Modified:   python/branches/py3k/Doc/whatsnew/3.2.rst Modified:

Re: [Python-Dev] Import and unicode: part two

2011-01-25 Thread Toshio Kuratomi
On Tue, Jan 25, 2011 at 10:22:41AM +0100, Xavier Morel wrote: On 2011-01-25, at 04:26 , Toshio Kuratomi wrote: * If you can pick a set of encodings that are valid (utf-8 for Linux and MacOS HFS+ uses UTF-16 in NFD (actually in an Apple-specific variant of NFD). Right here you've

Re: [Python-Dev] Location of tests for packages

2011-01-25 Thread Brett Cannon
On Mon, Jan 24, 2011 at 17:19, Raymond Hettinger raymond.hettin...@gmail.com wrote: On Jan 24, 2011, at 3:40 PM, Michael Foord wrote: It isn't just unittest, it seems that all *test packages* are in their respective package and not Lib/test except for the json module where Raymond already

Re: [Python-Dev] r88178 - python/branches/py3k/Lib/test/crashers/underlying_dict.py

2011-01-25 Thread Maciej Fijalkowski
On Tue, Jan 25, 2011 at 1:26 PM, Antoine Pitrou solip...@pitrou.net wrote: On Tue, 25 Jan 2011 01:00:28 +0100 (CET) benjamin.peterson python-check...@python.org wrote: Author: benjamin.peterson Date: Tue Jan 25 01:00:28 2011 New Revision: 88178 Log: another pretty crasher served up by pypy

Re: [Python-Dev] Location of tests for packages

2011-01-25 Thread Alexander Belopolsky
On Tue, Jan 25, 2011 at 12:38 PM, Brett Cannon br...@python.org wrote: .. If we move some modules and not others purely because some distros choose not to ship e.g., ctypes and sqlite3 I don't see why this is a problem. Regrtest already has a mechanism that allows skipping tests based on

Re: [Python-Dev] r88178 - python/branches/py3k/Lib/test/crashers/underlying_dict.py

2011-01-25 Thread Antoine Pitrou
On Tuesday 25 January 2011 at 20:11 +0200, Maciej Fijalkowski wrote: On Tue, Jan 25, 2011 at 1:26 PM, Antoine Pitrou solip...@pitrou.net wrote: On Tue, 25 Jan 2011 01:00:28 +0100 (CET) benjamin.peterson python-check...@python.org wrote: Author: benjamin.peterson Date: Tue Jan 25 01:00:28

Re: [Python-Dev] PEP 393: Flexible String Representation

2011-01-25 Thread M.-A. Lemburg
I'll comment more on this later this week... From my first impression, I'm not too thrilled by the prospect of making the Unicode implementation more complicated by having three different representations on each object. I also don't see how this could save a lot of memory. As an example take a

Re: [Python-Dev] PEP 393: Flexible String Representation

2011-01-25 Thread Antoine Pitrou
For the record: I also don't see how this could save a lot of memory. As an example take a French text with say 10 million code points. This would end up appearing in memory as 3 copies on Windows: one copy stored as UCS2 (20MB), one as Latin-1 (10MB) and one as UTF-8 (probably around 15MB,
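
The size estimates quoted above can be checked with a rough back-of-the-envelope sketch. The sample sentence is an assumption made for illustration and is not text from the thread; `utf-16-le` stands in for UCS2, which gives the same size as long as every character is in the Basic Multilingual Plane.

```python
# Assumed sample of French text; the real ratio depends on the actual document.
sample = "Les élèves français étudient à l'école près du château. "
n_code_points = 10_000_000


def bytes_per_code_point(text: str, encoding: str) -> float:
    """Average number of encoded bytes per code point for this text."""
    return len(text.encode(encoding)) / len(text)


for enc in ("utf-16-le", "latin-1", "utf-8"):
    estimate = bytes_per_code_point(sample, enc) * n_code_points
    print(f"{enc}: ~{estimate / 1e6:.1f} MB")
```

The two-byte and one-byte encodings give 20 MB and 10 MB by construction. The UTF-8 figure lies somewhere between the two and depends on how many accented characters the text contains.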

Re: [Python-Dev] r88178 - python/branches/py3k/Lib/test/crashers/underlying_dict.py

2011-01-25 Thread Martin v. Löwis
Some comments would be nice. Right now it looks pretty close to deliberately obfuscated code (especially with the call to gc.get_referrers()). That call tries to get at the class dictionary, rather than just the dict_proxy that you get from A.__dict__. There should be two referrers to thingy:

Re: [Python-Dev] PEP 393: Flexible String Representation

2011-01-25 Thread Antoine Pitrou
On Tue, 25 Jan 2011 21:08:01 +1000 Nick Coghlan ncogh...@gmail.com wrote: One change I would propose is that rather than hiding flags in the low order bits of the str pointer, we expand the use of the existing state field to cover the representation information in addition to the interning

Re: [Python-Dev] Location of tests for packages

2011-01-25 Thread Nick Coghlan
On Wed, Jan 26, 2011 at 4:16 AM, Alexander Belopolsky alexander.belopol...@gmail.com wrote: FWIW, I am +0 on consolidating tests under Lib/test.  One of the reasons that I have not seen mentioned is that it is well-known that test package is not part of the official stdlib API and can be

Re: [Python-Dev] [Python-checkins] r88197 - python/branches/py3k/Lib/email/generator.py

2011-01-25 Thread Nick Coghlan
On Wed, Jan 26, 2011 at 10:39 AM, victor.stinner python-check...@python.org wrote: Author: victor.stinner Date: Wed Jan 26 01:39:19 2011 New Revision: 88197 Log: Fix BytesGenerator._handle_text() if the message has no payload (None) Folks, for the peace of mind of python-checkins watchers,

Re: [Python-Dev] PEP 393: Flexible String Representation

2011-01-25 Thread Dj Gilcrease
On Tue, Jan 25, 2011 at 5:43 PM, M.-A. Lemburg m...@egenix.com wrote: I also don't see how this could save a lot of memory. As an example take a French text with say 10 million code points. This would end up appearing in memory as 3 copies on Windows: one copy stored as UCS2 (20MB), one as Latin-1 (10MB) and one as UTF-8 (probably around 15MB,

Re: [Python-Dev] [Python-checkins] r88197 - python/branches/py3k/Lib/email/generator.py

2011-01-25 Thread Brett Cannon
This broke the buildbots (R. David Murray thinks you may have forgotten to call super() in the 'payload is None' branch). Are you getting code reviews and fully running the test suite before committing? We are in RC. On Tue, Jan 25, 2011 at 16:39, victor.stinner python-check...@python.org wrote:

Re: [Python-Dev] Import and unicode: part two

2011-01-25 Thread Stephen J. Turnbull
Toshio Kuratomi writes: On Linux there's no defined encoding that will work; file names are just bytes to the Linux kernel so based on people's argument that the convention is and should be that filenames are utf-8 and anything else is a misconfigured system -- python should mandate that
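
As a small illustration of the "file names are just bytes" point, the sketch below shows a byte string that is a legal Linux file name but not valid UTF-8. The file name is hypothetical, and the `surrogateescape` error handler shown here is Python 3's general mechanism for round-tripping undecodable bytes (PEP 383), not something the message itself proposes.

```python
# Hypothetical file name encoded as Latin-1; byte 0xE9 is not valid UTF-8 here.
raw = b"caf\xe9.py"

try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape maps each undecodable byte to a lone surrogate code point,
# so the original bytes can be recovered exactly when encoding again.
name = raw.decode("utf-8", "surrogateescape")
round_trip = name.encode("utf-8", "surrogateescape")

print(strict_ok)          # False
print(ascii(name))        # 'caf\udce9.py'
print(round_trip == raw)  # True
```

The decoded name survives a round trip back to bytes, but it is not a well-formed Unicode string, which is why mandating a single file name encoding for imports was being debated.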

Re: [Python-Dev] Import and unicode: part two

2011-01-25 Thread Toshio Kuratomi
On Wed, Jan 26, 2011 at 11:24:54AM +0900, Stephen J. Turnbull wrote: Toshio Kuratomi writes: On Linux there's no defined encoding that will work; file names are just bytes to the Linux kernel so based on people's argument that the convention is and should be that filenames are utf-8