On 6 May '08, at 10:45 AM, Aki Inoue wrote:

Actually, I don't recommend using CP1252 as the generic fallback encoding like this. The encoding has gaps (byte values with no assigned character), and the handling of those invalid bytes varies between conversion engines. CF/NSString treat them strictly and return nil when they encounter one.

I wasn't aware it had gaps — I've never run into them. Where are they?
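One way to see where such gaps sit is to try decoding every single byte value on its own. The sketch below uses Python's built-in codecs rather than CF/NSString, so it only illustrates how one strict conversion engine behaves; other engines may map the same bytes to control characters instead of rejecting them. The helper name undefined_bytes is hypothetical.

```python
def undefined_bytes(encoding):
    """Return the byte values that a strict decoder for `encoding` rejects."""
    gaps = []
    for value in range(256):
        try:
            bytes([value]).decode(encoding)  # strict error handling is the default
        except UnicodeDecodeError:
            gaps.append(value)
    return gaps


cp1252_gaps = undefined_bytes("cp1252")
mac_roman_gaps = undefined_bytes("mac_roman")

print([hex(b) for b in cp1252_gaps])  # ['0x81', '0x8d', '0x8f', '0x90', '0x9d']
print(mac_roman_gaps)                 # []
```

With Python's codecs, CP1252 rejects five byte values while MacRoman accepts all 256, which matches the "no gap" property described in the quoted text.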

So, our recommendation now is to try UTF-8 first, then try some other encoding deduced from the context (the user's localization, the intended source or destination of the data, etc.). If all of those fail, try MacRoman as the ultimate fallback (the encoding has no gaps, so it never fails).
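The fallback order described above can be sketched roughly as follows. This is a Python illustration, not the CF/NSString code path; the function name decode_with_fallback and its parameters are hypothetical, and the last-resort encoding is left configurable because the right choice depends on where the data comes from.

```python
def decode_with_fallback(data, context_encodings=(), last_resort="mac_roman"):
    """Try UTF-8, then any context-derived encodings, then a gap-free last resort.

    Returns a (text, encoding_used) pair.
    """
    for encoding in ("utf-8", *context_encodings):
        try:
            return data.decode(encoding), encoding
        except (UnicodeDecodeError, LookupError):
            # Invalid bytes for this encoding, or an unknown encoding name.
            continue
    # The last resort is assumed to define all 256 byte values (as MacRoman
    # and ISO Latin-1 do in Python), so this decode is not expected to fail.
    return data.decode(last_resort, errors="replace"), last_resort


print(decode_with_fallback("caf\u00e9".encode("utf-8")))
print(decode_with_fallback(b"caf\xe9", ["cp1252"]))
print(decode_with_fallback(b"\x81", ["cp1252"]))
```

Passing last_resort="latin-1" instead swaps in a different gap-free encoding for the final step without changing the rest of the chain.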

In the contexts I've been dealing with, namely data fetched over HTTP from random websites, there hasn't been anything deducible from the context (assuming the HTTP Content-Type header has already failed to identify a usable encoding). In that situation MacRoman is not at all a good fallback, as almost no Web content uses it; CP1252 or ISO Latin-1 are the most likely fallbacks after UTF-8.

—Jens

_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)

Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com

Help/Unsubscribe/Update your Subscription:
http://lists.apple.com/mailman/options/cocoa-dev/archive%40mail-archive.com

This email sent to [EMAIL PROTECTED]

Reply via email to