On Tue, May 6, 2008 at 10:45 AM, Aki Inoue <[EMAIL PROTECTED]> wrote:
>
> On 2008/05/06, at 8:56, Jens Alfke wrote:
>
> > On 6 May '08, at 7:03 AM, Thomas Engelmeier wrote:
> >
> > > As the OP wants to create NSStrings with data created by his
> > > application I'm pretty sure he will not want the Windows encoding -
> > > unless he parses text documents originating from Windows.
> >
> > He didn't say where the data originates from, or what those APIs are
> > that return the strings. If they're networking APIs, the data could
> > very likely have originated on Windows.
> >
> > Also, you missed my point about using CP1252 (WinLatin1). It's useful
> > as a fallback for any unknown C strings because (a) it's a superset of
> > ISO-Latin-1, which (b) has no gaps in it (as ISO does, from
> > 0x80-0x9F), so decoding text into an NSString will never fail and
> > return nil. (I've debugged several crashes that stemmed from nil
> > NSStrings decoded from garbage strings.)
>
> Jens,
>
> Actually, I don't recommend using CP1252 as the generic fallback
> encoding like this.
>
> The encoding does have gaps, and the handling of those invalid gaps
> varies between conversion engines. CF/NSString treat the invalid bytes
> strictly and return nil on encountering them.
>
> Also, being compatible with ISO Latin1 (aka ISO 8859-1) is becoming a
> less compelling reason on the Net, since the overall share of the
> encoding (both ISO 8859-1 and cp1252 combined) is declining.
Not just declining, completely overtaken by UTF-8:
<http://googleblog.blogspot.com/2008/05/moving-to-unicode-51.html>

> > > If the bytes come from MacOS text files he may want to use the
> > > MacRoman encoding, otherwise creating UTF8 and passing around
> > > NSStrings will be the way to go - especially in Europe where all
> > > those äöüñá goodies exist.
> >
> > For the most part only old (pre-OS X) files would still be using
> > MacRoman. Current Mac apps generally default to UTF-8.
>
> So, our recommendation now is to try UTF-8 first; then, try some other
> encoding deduced from the context (user's localization, intended
> source/destination of the data, etc). If all of those fail, try
> MacRoman as the ultimate fallback (the encoding has no gaps, so
> decoding never fails).

--
Clark S. Cox III
[EMAIL PROTECTED]
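
For illustration only, the recommended order (UTF-8 first, then encodings deduced from context, then MacRoman as the last resort) can be sketched in Python. This is not Cocoa code: Python's codecs stand in for the CF/NSString converters, whose handling of invalid bytes may differ, and the function name decode_with_fallback and its context_encodings parameter are hypothetical.

```python
def decode_with_fallback(data, context_encodings=()):
    """Decode bytes of unknown encoding; return (text, encoding_used).

    Hypothetical helper: tries UTF-8, then each caller-supplied
    encoding, then MacRoman as the final fallback.
    """
    for encoding in ("utf-8", *context_encodings):
        try:
            # Strict decoding: invalid bytes raise instead of being
            # silently replaced, so we can move on to the next candidate.
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Python's mac_roman codec assigns a character to every byte value,
    # so this last step does not raise UnicodeDecodeError.
    return data.decode("mac_roman"), "mac_roman"


if __name__ == "__main__":
    print(decode_with_fallback("äöüñá".encode("utf-8")))
    print(decode_with_fallback(b"caf\xe9", ["cp1252"]))
    # 0x81 is one of the unassigned bytes in Python's cp1252 codec.
    print(decode_with_fallback(b"\x81", ["cp1252"]))
```

In this sketch the byte 0x81 is rejected by both the UTF-8 and the cp1252 codec and only decodes at the MacRoman step, which is the kind of gap Aki describes. The caller also learns which encoding was actually used, so a last-resort decode can be flagged rather than trusted.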
_______________________________________________
Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)