On Sat, Feb 28, 2009 at 8:58 AM, Shawn Erickson <shaw...@gmail.com> wrote:
> On Sat, Feb 28, 2009 at 8:45 AM, Clark Cox <clarkc...@gmail.com> wrote:
>
>>>... not sure what Michael is
>>> talking about.
>>
>> On Leopard, invalid bytes will indeed be escaped:
>
> Ah going back over the email chain I now get the context of the
> conversation when Michael made his comment about escaping.
>
> Anyway I was mostly pointing out that it isn't HFS+ doing this it is
> the POSIX APIs which expect UTF-8 and presumably some place now escape
> invalid bytes (non-UTF-8). HFS+ as I noted doesn't work with UTF-8.

Ah, so the escaping comes from utf8_decodestr (vfs_utfconv.c, shared by
all file systems) when you pass the UTF_ESCAPE_ILLEGAL option. If that
flag isn't specified, EINVAL is returned instead.

/*
 * utf8_decodestr - Decodes a UTF-8 string into Unicode
 *
 * This function takes an UTF-8 input string, utf8p, of utf8len bytes
 * and produces the Unicode output into a buffer of buflen bytes pointed
 * to by ucsp.  The size of the output in bytes (not including a NULL
 * termination byte) is returned in ucslen.  Both buffers must reside
 * in kernel memory.
 *
 * If '/' chars are allowed in the Unicode output then an alternate
 * (replacement) char must be provided in altslash.
 *
 * FLAGS
 *    UTF_REV_ENDIAN:     Unicode byte order is opposite current runtime
 *
 *    UTF_BIG_ENDIAN:     Unicode byte order is always big endian
 *
 *    UTF_LITTLE_ENDIAN:  Unicode byte order is always little endian
 *
 *    UTF_DECOMPOSED:     generate fully decomposed output (NFD)
 *
 *    UTF_PRECOMPOSED:    generate precomposed output (NFC)
 *
 *    UTF_ESCAPE_ILLEGAL: percent escape any illegal UTF-8 input
 *
 * ERRORS
 *    ENAMETOOLONG: output did not fit; only ucslen bytes were decoded.
 *
 *    EINVAL: illegal UTF-8 sequence encountered.
 */

At this time it looks like only the HFS+ file system code specifies
this flag when converting incoming UTF-8 names to the HFS+ Unicode
encoding. Interestingly, it isn't applied universally when HFS+ gets
UTF-8 names from its callers. It appears to happen only on catalog
entry creation and lookup; it isn't used for attribute names or for
post-creation name comparison. (I only did a very quick look over the
HFS+ code in XNU, so I could be misunderstanding the pathways a little.)

-Shawn
_______________________________________________
Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
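
To make the flag's effect concrete, here is a minimal user-space sketch of the behaviour the comment describes: illegal bytes are percent-escaped when the flag is set, and EINVAL comes back when it is not. This is not the kernel code. The names escape_illegal_utf8, utf8_seq_len and SKETCH_ESCAPE_ILLEGAL are made up for illustration, the output stays a byte string instead of being converted to UTF-16 as utf8_decodestr does, and the uppercase two-digit hex form is my assumption.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag for this sketch; not the kernel's UTF_ESCAPE_ILLEGAL value. */
#define SKETCH_ESCAPE_ILLEGAL 0x1

/* Length of the well-formed UTF-8 sequence at p (n >= 1 bytes available),
 * or 0 if the bytes at p do not start a legal sequence. */
static size_t utf8_seq_len(const unsigned char *p, size_t n)
{
    size_t len, i;
    unsigned long cp, min;

    if (p[0] < 0x80)
        return 1;                               /* plain ASCII */
    if ((p[0] & 0xE0) == 0xC0)      { len = 2; cp = p[0] & 0x1F; min = 0x80; }
    else if ((p[0] & 0xF0) == 0xE0) { len = 3; cp = p[0] & 0x0F; min = 0x800; }
    else if ((p[0] & 0xF8) == 0xF0) { len = 4; cp = p[0] & 0x07; min = 0x10000; }
    else
        return 0;                               /* stray continuation or bad lead byte */

    if (n < len)
        return 0;                               /* truncated sequence */
    for (i = 1; i < len; i++) {
        if ((p[i] & 0xC0) != 0x80)
            return 0;                           /* not a continuation byte */
        cp = (cp << 6) | (p[i] & 0x3F);
    }
    if (cp < min || cp > 0x10FFFF)
        return 0;                               /* overlong or out of range */
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                               /* surrogate code point */
    return len;
}

/* Copies in[0..inlen) to out as a NUL-terminated string. Legal sequences are
 * copied unchanged. Each illegal byte becomes "%XX" if the flag is set;
 * otherwise EINVAL is returned. ENAMETOOLONG means out was too small. */
int escape_illegal_utf8(const unsigned char *in, size_t inlen,
                        char *out, size_t outsize, int flags)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t i = 0, o = 0, k, len;

    while (i < inlen) {
        len = utf8_seq_len(in + i, inlen - i);
        if (len > 0) {
            if (o + len >= outsize)             /* keep room for the NUL */
                return ENAMETOOLONG;
            for (k = 0; k < len; k++)
                out[o++] = (char)in[i++];
        } else {
            if (!(flags & SKETCH_ESCAPE_ILLEGAL))
                return EINVAL;
            if (o + 3 >= outsize)
                return ENAMETOOLONG;
            out[o++] = '%';
            out[o++] = hex[in[i] >> 4];
            out[o++] = hex[in[i] & 0x0F];
            i++;
        }
    }
    if (o >= outsize)
        return ENAMETOOLONG;
    out[o] = '\0';
    return 0;
}
```

With the flag set, the four Latin-1 bytes "caf\xE9" come out as "caf%E9". Without the flag, the same input returns EINVAL, which matches the two outcomes described for utf8_decodestr.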