Martin v. Löwis:
> This appears to be based on the usedDefault return value of
> WideCharToMultiByte. I believe this is insufficient:
> WideCharToMultiByte might convert Unicode characters to
> codepage characters in a lossy way, without using the default
> character. For example, it converts U+0308 (combining diaeresis)
> to U+00A8 (diaeresis) (or something like that, I forgot the
> exact details). So if you have, say, "p-umlaut" (i.e. U+0070
> U+0308), it converts it to U+0070 U+00A8 (in the local code page).
> Trying to use this as a filename later fails.
There is WC_NO_BEST_FIT_CHARS to defeat that. With that flag set, the default character is used whenever the translation can't be round-tripped. It is available on Windows 2000 and XP but not NT4. We could compare the original against the round-tripped string, as described at
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_2bj9.asp

Neil
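
To make the round-trip comparison concrete, here is a minimal sketch in Python rather than against the Win32 API. The function name survives_round_trip and the choice of cp1252 are assumptions made for illustration. Python's own codec stands in for WideCharToMultiByte and does no best-fit mapping, so this only shows the comparison logic, not the Windows behaviour itself.

```python
def survives_round_trip(name: str, codepage: str = "cp1252") -> bool:
    """Return True if name encodes to codepage and decodes back unchanged.

    Hypothetical helper: Python's codec is used here as a stand-in for
    WideCharToMultiByte / MultiByteToWideChar.
    """
    # errors="replace" substitutes "?" for unmappable characters, which
    # plays the role of the default character in the Win32 call.
    encoded = name.encode(codepage, errors="replace")
    # If anything was replaced or altered, the decoded text differs
    # from the original and the name is not safe to use as-is.
    return encoded.decode(codepage, errors="replace") == name


if __name__ == "__main__":
    # U+0070 U+0308 ("p" plus combining diaeresis) has no cp1252 form.
    print(survives_round_trip("p\u0308"))
    # U+00FC (precomposed u-umlaut) is representable in cp1252.
    print(survives_round_trip("\u00fc"))
```

The same comparison would catch a lossy best-fit conversion as well, since the converted string would no longer equal the original.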