Re: [Python-Dev] zipfile and unicode filenames

2007-06-10 Thread Alexey Borzenkov
On 6/11/07, Alexey Borzenkov <[EMAIL PROTECTED]> wrote:
> The problem is that I don't know if anything actually supports bit 11
> at the time and can't even tell if I did this correctly or not. :(
I downloaded the latest WinZip and can confirm that it parses utf-8 filenames correctly (although it
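One way to tell whether bit 11 ended up set on an entry is to read the general purpose flags back from the archive. This is only a minimal sketch, assuming a current Python 3 zipfile module (which, as far as I know, sets this bit by itself for names that cannot be encoded as ASCII). The helper names `utf8_flag_report` and `build_sample_archive` and the sample file names are made up for illustration and are not part of the patch discussed in this thread.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # bit 11 of the general purpose bit flag


def utf8_flag_report(archive_bytes):
    """Map each member name to True if bit 11 is set on its entry."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return {info.filename: bool(info.flag_bits & UTF8_FLAG)
                for info in zf.infolist()}


def build_sample_archive():
    """Build a small in-memory archive with one ASCII and one non-ASCII name."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("plain.txt", b"ascii name")
        zf.writestr("\u0442\u0435\u0441\u0442.txt", b"cyrillic name")
    return buf.getvalue()
```

Calling `utf8_flag_report(build_sample_archive())` gives a per-name view of the flag, which can then be compared with what an external tool such as WinZip displays.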

[Python-Dev] Question about dictobject.c:lookdict_string

2007-06-10 Thread Eyal Lotem
My question is specifically regarding the transition back from lookdict_string (the initial value) to the general lookdict. Currently, when a string-only dict is trying to look up any non-string, it reverts back to a general lookdict. Wouldn't it be better (especially in the more important case o

Re: [Python-Dev] zipfile and unicode filenames

2007-06-10 Thread Alexey Borzenkov
On 6/11/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> For compatibility, I would propose to use UTF-8 only if the file
> name is not ASCII. Even though the OEM code pages vary, they
> are (mostly) ASCII supersets. So if the string can be encoded
> in ASCII, there is no need to set the UTF-8 fl
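The rule proposed in the quote above (UTF-8 and bit 11 only when the name is not pure ASCII) can be sketched roughly as follows. The function name `encode_member_name` is hypothetical and this is not the submitted patch, just an illustration of the idea under that assumption.

```python
UTF8_FLAG = 0x800  # bit 11 of the general purpose bit flag


def encode_member_name(name, flag_bits=0):
    """Return (encoded_name, flag_bits) for a zip member name.

    ASCII names are stored as-is and the flags are left untouched;
    anything else is stored as UTF-8 with bit 11 set.
    """
    try:
        return name.encode("ascii"), flag_bits
    except UnicodeEncodeError:
        return name.encode("utf-8"), flag_bits | UTF8_FLAG
```

Because the OEM code pages are (mostly) ASCII supersets, the ASCII branch yields bytes that older tools should read the same way regardless of which code page they assume.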

Re: [Python-Dev] Frame zombies

2007-06-10 Thread Martin v. Löwis
> I am not sure how to benchmark such modifications. Is there any
> benchmark that includes threaded use of the same functions in typical
> use cases?
I don't think it's necessary to benchmark that specific case - *any* kind of micro-benchmark would be better than none. If you want to introduce fr

Re: [Python-Dev] Frame zombies

2007-06-10 Thread Eyal Lotem
On 6/10/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > Note that _only_ recursions will have more than 1 frame attached.
>
> That's not true; in the presence of threads, the same method
> may also be invoked more than one time simultaneously.
Yes, I have missed that, and realized that I miss
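The point about threads can be illustrated with a small sketch: two threads sit inside the same function at the same moment, so two distinct frames exist for one code object without any recursion. This assumes CPython (`sys._getframe` is an implementation detail), and the names `worker` and `frames_alive_together` are made up for the example.

```python
import sys
import threading


def worker(barrier, frames, index):
    # Record this call's frame, then wait until every thread is inside worker().
    frames[index] = sys._getframe()
    barrier.wait()


def frames_alive_together(count=2):
    """Run worker() in `count` threads at once and return their frames."""
    barrier = threading.Barrier(count)
    frames = [None] * count
    threads = [threading.Thread(target=worker, args=(barrier, frames, i))
               for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return frames
```

The barrier only releases once all threads have entered `worker`, so every recorded frame was live simultaneously with the others, and each one refers to the same code object.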

Re: [Python-Dev] Instance variable access and descriptors

2007-06-10 Thread Aahz
On Sun, Jun 10, 2007, Eyal Lotem wrote:
>
> Python, probably through the valid assumption that most attribute
> lookups go to the class, tries to look for the attribute in the class
> first, and in the instance, second.
>
> What Python currently does is quite peculiar!
> Here's a short descriptio
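For reference, the lookup order for instances of new-style classes can be observed directly: a data descriptor on the class (one that defines `__set__`) wins over the instance dictionary, while a non-data descriptor loses to it. A minimal sketch; all class and attribute names here are invented for illustration.

```python
class DataDescriptor:
    """Defines both __get__ and __set__, so it is a data descriptor."""

    def __get__(self, obj, objtype=None):
        return "from data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only")


class NonDataDescriptor:
    """Defines only __get__, so the instance dict can shadow it."""

    def __get__(self, obj, objtype=None):
        return "from non-data descriptor"


class Example:
    guarded = DataDescriptor()
    plain = NonDataDescriptor()


def lookup_demo():
    inst = Example()
    # Write straight into the instance dict, bypassing any __set__.
    inst.__dict__["guarded"] = "from instance dict"
    inst.__dict__["plain"] = "from instance dict"
    return inst.guarded, inst.plain
```

Here `lookup_demo()` returns `("from data descriptor", "from instance dict")`: the class is consulted first, but the instance entry is only overridden when the class attribute is a data descriptor.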

Re: [Python-Dev] zipfile and unicode filenames

2007-06-10 Thread Martin v. Löwis
> But this is only on Windows! I have no clue what's the common
> situation on other OSes and don't even know how to sanely get OEM
> codepage on Windows (the obvious way with ctypes.kernel32.GetOEMCP()
> doesn't seem good to me).
>
> So I guess that's bad idea anyway, maybe conforming to language

Re: [Python-Dev] zipfile and unicode filenames

2007-06-10 Thread Alexey Borzenkov
On 6/10/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > So the general idea is that at least directory filename has some sort
> > of convention of using oem (dos, console) encoding on Windows, cp866
> > in my case. Header filenames have different encodings, and seem to be
> > ignored.
> Ok, th

Re: [Python-Dev] zipfile and unicode filenames

2007-06-10 Thread Martin v. Löwis
> So the general idea is that at least directory filename has some sort
> of convention of using oem (dos, console) encoding on Windows, cp866
> in my case. Header filenames have different encodings, and seem to be
> ignored.
Ok, then this is what the zipfile module should implement.
>> That woul
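On the reading side, the convention described above could be sketched like this: trust UTF-8 when bit 11 is set, otherwise fall back to an OEM code page. The function name `decode_member_name` and the `oem_encoding` parameter are assumptions made for the example; which code page applies depends on the system that created the archive (cp866 in the case reported in this thread).

```python
UTF8_FLAG = 0x800  # bit 11 of the general purpose bit flag


def decode_member_name(raw, flag_bits, oem_encoding="cp866"):
    """Decode a raw member name from the central directory.

    raw          -- the name bytes as stored in the archive
    flag_bits    -- the entry's general purpose bit flag
    oem_encoding -- code page to assume when bit 11 is not set
    """
    if flag_bits & UTF8_FLAG:
        return raw.decode("utf-8")
    return raw.decode(oem_encoding)
```

Picking the wrong OEM code page does not raise an error for code pages that map every byte, it just produces a different name, which is why the choice of default matters.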

Re: [Python-Dev] Fwd: Instance variable access and descriptors

2007-06-10 Thread Phillip J. Eby
At 11:27 AM 6/10/2007 +0100, Gustavo Carneiro wrote:
> I have to agree with you. If removing support for
> self.__dict__['propertyname'] (where propertyname is also the name
> of a descriptor) is the price to pay for significant speedup, so be
> it. People doing that are asking for trouble a

Re: [Python-Dev] zipfile and unicode filenames

2007-06-10 Thread Alexey Borzenkov
On 6/10/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > I don't think always encoding them to utf-8 (and using bit 11 of
> > flag_bits) is a good idea, since there's a chance to create archives
> > that won't be correctly readable by programs not supporting this bit
> > (it's no secret that cu

Re: [Python-Dev] Fwd: Instance variable access and descriptors

2007-06-10 Thread Phillip J. Eby
At 04:14 AM 6/10/2007 +0300, Eyal Lotem wrote:
>On 6/10/07, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> > At 12:23 AM 6/10/2007 +0300, Eyal Lotem wrote:
> > >A. It will break code that uses instance.__dict__['var'] directly,
> > >when 'var' exists as a property with a __set__ in the class. I believ
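The kind of code point A refers to looks roughly like the sketch below: a property with a setter that keeps its raw value in the instance dict under the same name. It relies on the property on the class taking precedence over that dict entry on attribute access. The `Account` class is invented for illustration, and the exact lookup change under discussion is not visible in this excerpt.

```python
class Account:
    @property
    def balance(self):
        # Reads the raw value stored under the same name in the instance dict.
        return "%d credits" % self.__dict__.get("balance", 0)

    @balance.setter
    def balance(self, value):
        if value < 0:
            raise ValueError("balance cannot be negative")
        self.__dict__["balance"] = value


def bypass_demo():
    acct = Account()
    acct.balance = 10                # goes through the setter
    acct.__dict__["balance"] = -5    # direct write, setter is not called
    # Attribute access still runs the getter, because a data descriptor on
    # the class wins over the instance dict entry of the same name.
    return acct.balance, acct.__dict__["balance"]
```

With current semantics `bypass_demo()` returns `("-5 credits", -5)`; if the instance dict were consulted before the property, `acct.balance` would yield the raw `-5` instead.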

Re: [Python-Dev] zipfile and unicode filenames

2007-06-10 Thread Martin v. Löwis
> I don't think always encoding them to utf-8 (and using bit 11 of
> flag_bits) is a good idea, since there's a chance to create archives
> that won't be correctly readable by programs not supporting this bit
> (it's no secret that currently some programs just assume that
> filenames are encoded us

Re: [Python-Dev] zipfile and unicode filenames

2007-06-10 Thread Alexey Borzenkov
> Current zipfile seems to officially support ascii filenames only
> anyway, so the patch can be as simple as this:
Submitted patch and test case as http://python.org/sf/1734346

Re: [Python-Dev] zipfile and unicode filenames

2007-06-10 Thread Alexey Borzenkov
> > Also note that I'm trying to ask if zipfile should be improved, how it
> > should be improved, and this possible improvement is not even for me
> > (because now I know how zipfile behaves and I will work correctly with
> > it, but someone else might stumble upon this very unexpectedly).
> If yo

Re: [Python-Dev] Fwd: Instance variable access and descriptors

2007-06-10 Thread Gustavo Carneiro
I have to agree with you. If removing support for self.__dict__['propertyname'] (where propertyname is also the name of a descriptor) is the price to pay for significant speedup, so be it. People doing that are asking for trouble anyway!
On 10/06/07, Eyal Lotem <[EMAIL PROTECTED]> wrote: On

Re: [Python-Dev] zipfile and unicode filenames

2007-06-10 Thread Martin v. Löwis
> sys.setdefaultencoding()
> exists for a reason, wouldn't it be better if stdlib could cope with
> that at least with zipfile?
sys.setdefaultencoding just does not work. Many more things break when you call it. It only exists because people like you insisted that it exists.
> Also note that I'm