> Originally I thought that this was a valid idea, but then it became
> clear that this could be a problem. Consider a filename which includes
> a UTF-8 encoding of a PUA code point.

I still think it's a valid idea. For non-UTF-8 file system encodings,
use PUA characters, and generate them through an error handler. If the
file system encoding is UTF-8, use UTF-8b instead as the file system
encoding.

> Viewing the PUA with GNOME charmap, I can see that many code points
> there have character renderings on my Ubuntu system. I have to assume,
> therefore, that there are other (and potentially conflicting) uses for
> this unicode feature.

Depends on how you use it. If you use PUA block 1 (i.e.
U+E000..U+F8FF), there is a realistic chance of collision. If you use
the Plane 15 or Plane 16 PUA blocks, there is currently zero chance of
collision (AFAIK).

PUA has wide use for additional characters in TrueType, but I don't
think many tools even support planes 15 and 16 for generating fonts, or
rendering them (it may even be that the TrueType/OpenType format doesn't
support them in the first place). However, Python can make use of these
planes fairly easily, even in 2-byte mode (through UTF-16).

Regards,
Martin
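
A minimal sketch of the error-handler idea for a non-UTF-8 file system
encoding, written for Python 3. The handler name "pua-escape", the
base offset U+F0000 (start of the Plane 15 PUA) and the helper functions
are assumptions made for illustration; they are not an existing Python
API, and the ASCII default merely stands in for the real file system
encoding.

```python
import codecs

# Assumption: undecodable byte 0xNN is mapped to U+F00NN (Plane 15 PUA).
PUA_BASE = 0xF0000


def pua_escape(exc):
    """Decode error handler: replace each undecodable byte with a PUA character."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    # Return the replacement text and the position at which to resume decoding.
    return "".join(chr(PUA_BASE + b) for b in bad), exc.end


# "pua-escape" is a hypothetical handler name chosen for this sketch.
codecs.register_error("pua-escape", pua_escape)


def decode_filename(raw, encoding="ascii"):
    """Decode file name bytes, escaping undecodable bytes into the PUA."""
    return raw.decode(encoding, "pua-escape")


def encode_filename(name, encoding="ascii"):
    """Reverse of decode_filename: turn escaped PUA characters back into bytes."""
    out = bytearray()
    for ch in name:
        cp = ord(ch)
        if PUA_BASE <= cp <= PUA_BASE + 0xFF:
            out.append(cp - PUA_BASE)      # restore the original raw byte
        else:
            out += ch.encode(encoding)     # ordinary character
    return bytes(out)


if __name__ == "__main__":
    raw = b"caf\xe9.txt"                   # 0xE9 is not valid ASCII
    name = decode_filename(raw)
    print(ascii(name))
    print(encode_filename(name) == raw)
```

The round trip only holds as long as real file names do not themselves
decode to characters in the chosen PUA range, which is the collision
question discussed above. In UTF-16, each escaped character occupies a
surrogate pair (two 16-bit code units).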