Kenneth Whistler scripsit:

> Storage of UNIX filenames on Windows databases, for example,
> can be done with BINARY fields, which correctly capture the
> identity of them as what they are: an unconvertible array of
> byte values, not a convertible string in some particular
> code page.

This solution, however, is overkill, in the same way that it would be
overkill to encode all 8-bit strings in XML using Base-64 just because
some of them may contain control characters that are illegal in
well-formed XML.

> In my opinion, trying to do that with a set of encoded characters
> (these 128 or something else) is *less* likely to solve the
> problem than using some visible markup convention instead.

The trouble with the visible markup, or even the PUA, is that
"well-formed filenames", those which are interpretable as UTF-8 text,
must also be encoded so as to be sure any markup or PUA that naturally
appears in the filename is escaped properly. This is essentially the
Quoted-Printable encoding, which is quite rightly known to those stuck
with it as "Quoted-Unprintable".

> Simply
> encoding 128 characters in the Unicode Standard ostensibly to
> serve this purpose is no guarantee whatsoever that anyone would
> actually implement and support them in the universal way you
> envision, any more than they might a "=93", "=94" convention.

Why not, when it's so easy to do so? And they'd be *there*, reserved,
unassignable for actual character encoding. Plane E would be a
plausible location.

-- 
John Cowan <[EMAIL PROTECTED]>  http://www.reutershealth.com
I amar prestar aen, han mathon ne nen,  http://www.ccil.org/~cowan
han mathon ne chae, a han noston ne 'wilith.  --Galadriel, LOTR:FOTR
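
A minimal sketch of the round trip argued for in this message, offered only
as an illustration. It assumes Python's built-in "surrogateescape" error
handler as a stand-in for the 128 reserved code points: that handler maps
each undecodable byte 0xNN to the lone surrogate U+DCNN, which is not the
Plane E location suggested here. The names to_text and to_bytes and the
sample filenames are hypothetical.

```python
# Assumption: "surrogateescape" stands in for the proposed reserved block.
# It uses U+DC80..U+DCFF (lone surrogates), not code points in Plane E.

def to_text(raw: bytes) -> str:
    """Decode a filename; each undecodable byte 0xNN becomes U+DCNN."""
    return raw.decode("utf-8", errors="surrogateescape")


def to_bytes(text: str) -> bytes:
    """Invert to_text exactly, restoring the original byte sequence."""
    return text.encode("utf-8", errors="surrogateescape")


# A well-formed UTF-8 name that happens to contain the literal text "=93".
well_formed = "r\u00e9sum\u00e9=93.txt".encode("utf-8")

# An ill-formed name: 0xE9 is a Latin-1 byte, not valid UTF-8 here.
ill_formed = b"caf\xe9=93.txt"

# Well-formed names pass through untouched; "=93" needs no escaping.
assert to_text(well_formed) == "r\u00e9sum\u00e9=93.txt"

# Only the offending byte is mapped to a reserved code point.
assert to_text(ill_formed) == "caf\udce9=93.txt"

# Both names survive the round trip byte for byte.
assert to_bytes(to_text(well_formed)) == well_formed
assert to_bytes(to_text(ill_formed)) == ill_formed
```

With a visible markup convention such as "=93", the literal "=93" in the
first name would itself have to be escaped; with reserved code points only
the ill-formed byte is touched.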

