Guido van Rossum wrote:
> Ah, sigh. I didn't know that os.listdir() behaves differently when the
> argument is Unicode. Does os.listdir(".") really behave differently
> than os.listdir(u".")? Bah! I don't think that's a very good design
> (although I see where it comes from). Promoting only those entries
> that need it seems the right solution

Unfortunately, this solution is hard to implement (I don't know whether
it can be implemented correctly at all; at least on Windows, I see no
way to implement it efficiently).

Here are a number of problems/questions:
- On Windows, should listdir use the narrow or the wide API? Obviously
  the wide API, since it is the narrow Windows API, not Python, that
  returns question marks for characters it cannot represent.
- But then, the wide API gives all results as Unicode. If you want to
  promote only those entries that need it, it really means that you
  only want to "demote" those that don't need it. But how can you tell
  whether an entry needs it? There is no API to find out.
  You could declare that anything with characters >= 128 needs it,
  but that would be an incompatible change: If a character >= 128 in
  the system code page is in a file name, listdir currently returns
  it in the system code page. It then would return a Unicode string.
  Applications relying on the old behaviour would break.
- On Unix, all file names come out as byte strings. Again, how do
  you know which ones to promote, and using what encoding? Python
  currently guesses an encoding, but that may or may not be the one
  intended for the file name.
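
The two heuristics discussed in the list above can be sketched as follows.
This is only an illustration, not a proposed implementation: both helper
names are hypothetical and not part of the os module, and the sketch uses
Python 3 types (str for Unicode, bytes for byte strings) so that it runs
on a current interpreter, whereas the thread concerns Python 2.

```python
# Hypothetical helpers for illustration only; neither exists in the os module.

def demote_ascii_only(names):
    """Windows-style heuristic: the wide API yields Unicode for everything,
    so turn pure-ASCII names back into byte strings and keep the rest."""
    result = []
    for name in names:
        try:
            result.append(name.encode("ascii"))
        except UnicodeEncodeError:
            # Contains a character >= 128: stays Unicode.
            result.append(name)
    return result


def promote_if_decodable(raw_names, encoding):
    """Unix-style heuristic: names arrive as byte strings, so decode them
    with a guessed encoding and keep the raw bytes when the guess fails."""
    result = []
    for raw in raw_names:
        try:
            result.append(raw.decode(encoding))
        except UnicodeDecodeError:
            result.append(raw)
    return result


# Names as a wide API might report them (assumed sample data).
wide = ["readme.txt", "caf\u00e9.txt", "\u0444\u0430\u0439\u043b.txt"]
mixed = demote_ascii_only(wide)

# Names as a Unix file system might store them; the second one was
# written in Latin-1, which a UTF-8 guess cannot decode.
unix = [b"readme.txt", "caf\u00e9.txt".encode("latin-1")]
guessed = promote_if_decodable(unix, "utf-8")

print(mixed)
print(guessed)
```

Both results are lists of mixed type. In the first, a name such as
"caf\u00e9.txt" comes back as Unicode even if the system code page could
have represented it, which is the incompatibility described above. In the
second, the outcome depends entirely on the guessed encoding.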

So the general "Bah!" doesn't really help much: when it comes to
a specific algorithm to implement, the options are scarce.

Regards,
Martin
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev