On Tue, May 6, 2008 at 10:45 AM, Aki Inoue <[EMAIL PROTECTED]> wrote:
>
>  On 2008/05/06, at 8:56, Jens Alfke wrote:
>
> > On 6 May '08, at 7:03 AM, Thomas Engelmeier wrote:
> >
> > > As the OP wants to create NSStrings with data created by his application,
> > > I'm pretty sure he will not want the Windows encoding, unless he parses
> > > text documents originating from Windows.
> > >
> >
> > He didn't say where the data originates from, or what those APIs are that
> > return the strings. If they're networking APIs, the data could very likely
> > have originated on Windows.
> >
> > Also, you missed my point about using CP1252 (WinLatin1). It's useful as a
> > fallback for any unknown C strings because (a) it's a superset of
> > ISO-Latin-1, which (b) has no gaps in it (as ISO does, from 0x80-0x9F), so
> > decoding text into an NSString will never fail and return nil. (I've
> > debugged several crashes that stemmed from nil NSStrings decoded from
> > garbage strings.)
> >
>  Jens,
>
>  Actually, I don't recommend using CP1252 as the generic fallback encoding
> like this.
>
>  The encoding does have gaps, and the handling of those invalid bytes varies
> between conversion engines.  CF/NSString treat the invalid bytes strictly
> and return nil on encountering them.
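
The gaps are easy to see outside of CF/NSString as well. Here is a minimal
sketch in Python, assuming its standard "cp1252" codec. These are Python's
tables, not Apple's converters, so the exact set of rejected bytes may differ
from what CF/NSString does; the helper name undecodable_bytes is made up for
illustration.

```python
# List the single-byte values that a strict decoder rejects.
# Assumption: Python's built-in "cp1252" codec stands in for a strict
# converter; it is not the CF/NSString implementation.

def undecodable_bytes(encoding):
    """Return every byte value (0-255) that fails strict decoding."""
    bad = []
    for value in range(256):
        try:
            bytes([value]).decode(encoding)
        except UnicodeDecodeError:
            bad.append(value)
    return bad

cp1252_gaps = undecodable_bytes("cp1252")
print([hex(b) for b in cp1252_gaps])
```

Any input containing one of the listed bytes makes a strict decode fail, which
is the situation that produces a nil NSString.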
>
>  Also, being compatible with ISO Latin1 (aka ISO 8859-1) is becoming a less
> compelling reason on the Net, since the overall share of that encoding
> (ISO 8859-1 and cp1252 combined) is declining.

Not just declining, completely overtaken by UTF-8:

<http://googleblog.blogspot.com/2008/05/moving-to-unicode-51.html>

> >
> > > If the bytes come from MacOS text files he may want to use the MacRoman
> > > encoding, otherwise creating UTF8 and passing around NSStrings will be the
> > > way to go - especially in Europe where all those äöüñá goodies exist.
> > >
> >
> > For the most part only old (pre-OS X) files would still be using MacRoman.
> > Current Mac apps generally default to UTF-8.
> >
>  So, our recommendation now is to try UTF-8 first; then try some other
> encoding deduced from the context (user's localization, intended
> source/destination of the data, etc.).  If all of those fail, try MacRoman as
> the ultimate fallback (the encoding has no gaps, so decoding never fails).
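
The order recommended above can be sketched roughly as follows. This is an
illustration in Python rather than Cocoa code: the function name
decode_with_fallback and its context_encodings parameter are made up for the
example, and Python's "mac_roman" codec is assumed to stand in for MacRoman.

```python
# Try UTF-8, then context-derived guesses, then MacRoman as the last resort.
# Assumption: Python codecs stand in for the CF/NSString converters.

def decode_with_fallback(data, context_encodings=()):
    """Return (text, encoding_used) for a bytes object.

    context_encodings is a hypothetical list of guesses derived from the
    user's locale or the known source of the data.
    """
    for encoding in ("utf-8", *context_encodings):
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # strict decode failed, move on to the next candidate
    # MacRoman assigns a character to every byte, so this cannot fail.
    return data.decode("mac_roman"), "mac_roman"

text, used = decode_with_fallback("äöü".encode("utf-8"))
print(used, text)

# 0x81 is invalid on its own in UTF-8 and undefined in cp1252.
text, used = decode_with_fallback(b"\x81", ["cp1252"])
print(used, text)
```

Trying UTF-8 first is reasonable because text in a legacy 8-bit encoding
rarely happens to be valid UTF-8. The last step always yields a string, but
it may contain the wrong characters if the data was not really MacRoman.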


-- 
Clark S. Cox III
[EMAIL PROTECTED]
_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)