If you're interested in determining the best encoding match for text, look at 
the TextEncodingConverter.h header, which has functions related to encoding 
sniffing.  There may be more modern techniques available, but I used that 
almost a decade ago in a formerly major web browser.  It's not perfect, of 
course, but it might be the best solution for your problem.

>
>On May 6, 2008, at 9:22 PM, Jens Alfke wrote:
>
>>
>> On 6 May '08, at 10:45 AM, Aki Inoue wrote:
>>
>>> Actually, I don't recommend using CP1252 as the generic fallback  
>>> encoding like this.
>>> The encoding does have gaps, and the handling of those invalid bytes  
>>> varies between conversion engines.  CF/NSString treat the invalid  
>>> bytes strictly and return nil on encountering them.
>>
>> I wasn't aware it had gaps — I've never run into them. Where are they?
>
><http://en.wikipedia.org/wiki/Windows-1252>
>
>Five byte values in the 0x80..0x9F range are unassigned: 0x81, 0x8D,
>0x8F, 0x90, and 0x9D.
>
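A quick way to see the gaps is to try decoding every single byte value. This is only a sketch in Python, not CF/NSString; it assumes Python's bundled `cp1252` codec, which happens to be strict about unassigned bytes in the way described above. The helper name `undefined_bytes` is made up for illustration.

```python
def undefined_bytes(encoding: str) -> list[int]:
    """Return the single-byte values that a strict decoder rejects."""
    gaps = []
    for value in range(256):
        try:
            bytes([value]).decode(encoding)
        except UnicodeDecodeError:
            # A strict decoder refuses bytes the encoding leaves unassigned.
            gaps.append(value)
    return gaps


if __name__ == "__main__":
    for name in ("cp1252", "mac_roman", "latin-1"):
        print(name, [hex(b) for b in undefined_bytes(name)])
```

Other conversion engines may map those same bytes to C1 control characters instead of failing, which is the variation mentioned above.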
>>> So, our recommendation now is to try UTF-8 first; then try some  
>>> other encoding deduced from the context (user's localization,  
>>> intended source/destination of the data, etc).  If all of those  
>>> fail, try MacRoman as the ultimate fallback (the encoding has no  
>>> gaps, so it never fails).
>>
>> In the contexts I've been dealing with — data fetched over HTTP from  
>> random websites — there hasn't been anything deducible from the  
>> context (assuming the HTTP Content-Type already failed.) In that  
>> situation MacRoman is not at all a good fallback as almost no Web  
>> content uses it; CP-1252 or ISO-Latin-1 are the most likely  
>> fallbacks after UTF-8.
>
>
>I agree with this if it's web content you're dealing with.  In that  
>case, though, just fall back to Windows-1252.  A lot of site content  
>was authored in that encoding and mistakenly labeled as ISO-8859-1.  
>But that's a topic for another forum.
>
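The fallback order discussed in this thread (UTF-8 first, then anything deducible from context, then an encoding with no gaps) can be sketched roughly as follows. This is an illustration in Python rather than Cocoa, and the function name `decode_with_fallbacks` and its parameters are hypothetical, not part of any Apple API.

```python
def decode_with_fallbacks(data: bytes,
                          context_encodings: tuple[str, ...] = (),
                          last_resort: str = "mac_roman") -> tuple[str, str]:
    """Return (text, encoding_used), trying strict decoders in order."""
    for name in ("utf-8", *context_encodings):
        try:
            # Strict decoding: any invalid byte sequence rejects the candidate.
            return data.decode(name), name
        except (UnicodeDecodeError, LookupError):
            continue
    # The last resort should be a gap-free single-byte encoding, so this
    # decode is expected to succeed; errors="replace" is only a safety net.
    return data.decode(last_resort, errors="replace"), last_resort


if __name__ == "__main__":
    print(decode_with_fallbacks("café".encode("utf-8")))
    print(decode_with_fallbacks(b"caf\xe9", context_encodings=("cp1252",)))
    print(decode_with_fallbacks(b"\x81", context_encodings=("cp1252",)))
```

For web content, one option along the lines suggested above is to pass `("cp1252",)` as the context encodings and use `"latin-1"` as the last resort instead of MacRoman, since Latin-1 also accepts every byte value.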
_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
