Thanks, John. I'll have a look and let you know if I need more help!

-Laurent.
-- 
Laurent Daudelin
AIM/iChat/Skype:LaurentDaudelin                                 
http://www.nemesys-soft.com/
Logiciels Nemesys Software                                      
laur...@nemesys-soft.com

On Apr 26, 2011, at 12:39, John Pannell wrote:

> Hi Laurent-
> 
> I have an app that collects a lot of text off the web; my string creation 
> algorithm is something like the following:
> 
> 1.  Attempt to create an NSString with NSUTF8StringEncoding.
> 2.  If the string is nil, attempt to create the string using the encoding 
> returned from the server.
> 3.  If the string is still nil, ask the Text Encoding Conversion Manager to 
> sniff out the encoding from the data.
>       3a.  This returns an array of likely encodings.  For each item in the 
> array:
>       3b.  Attempt to create a string with the encoding.
> 
> There was a little too much code associated with this to copy/paste into 
> email, but I'd be happy to share... I have a wrapper object for the needed 
> interaction with the Text Encoding Conversion Manager.  Some more about it:
> 
> http://developer.apple.com/library/mac/#documentation/Carbon/Reference/Text_Encodin_sion_Manager/Reference/reference.html%23//apple_ref/doc/uid/TP30000123
> 
> Hope this helps!
> 
> On Apr 26, 2011, at 12:53 PM, Nick Zitzmann wrote:
> 
>> 
>> On Apr 26, 2011, at 12:49 PM, Laurent Daudelin wrote:
>> 
>>>> TextEdit's encoding guesser just uses the built-in NSAttributedString 
>>>> method -initWithURL:options:documentAttributes:error:, which will guess 
>>>> the file's encoding when opening it. But it has been mentioned that 
>>>> heuristics are not infallible, and this method's heuristics are no 
>>>> exception. It does a good job overall, but I've found that it usually 
>>>> misinterprets UTF-8 format text.
>>> 
>>> Yes, I know that all the guess jobs can fail. I was starting to get excited 
>>> when I started reading your reply, but if it usually misinterprets UTF-8, 
>>> that's a pretty significant problem...
>> 
>> That was a long time ago, so it may have been fixed. But if it's still 
>> happening, then one workaround would be to try and open the file as UTF-8 
>> first, and if that fails, then fall back on the above method. The UTF-8 
>> parser often returns nil on text that is not in UTF-8 format IIRC.
>> 
> 
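
For reference, the fallback chain John describes in steps 1 to 3b (and the "try UTF-8 first" workaround Nick suggests) might be sketched roughly as follows. This is a language-neutral illustration in Python, not the Cocoa code from either message: the function name decode_with_fallbacks, the server_encoding parameter, and the sniff_encodings callback (standing in for whatever the Text Encoding Conversion Manager wrapper returns) are all hypothetical names chosen for the example.

```python
from typing import Callable, Iterable, Optional


def decode_with_fallbacks(
    data: bytes,
    server_encoding: Optional[str] = None,
    sniff_encodings: Callable[[bytes], Iterable[str]] = lambda data: (),
) -> Optional[str]:
    """Return the first successful decoding of data, or None if all fail.

    sniff_encodings is a hypothetical stand-in for an encoding sniffer: it
    takes the raw bytes and returns candidate encoding names, most likely
    first.
    """

    def candidates():
        # Step 1: always try strict UTF-8 first.
        yield "utf-8"
        # Step 2: then the encoding reported by the server, if there was one.
        if server_encoding:
            yield server_encoding
        # Steps 3a/3b: only now ask the sniffer. The generator is lazy, so
        # the sniffer is never called when an earlier step succeeds.
        yield from sniff_encodings(data)

    for name in candidates():
        try:
            return data.decode(name)
        except (UnicodeDecodeError, LookupError):
            # UnicodeDecodeError: the bytes are not valid in this encoding.
            # LookupError: the encoding name is unknown. Try the next one.
            continue
    return None


if __name__ == "__main__":
    raw = b"caf\xe9"  # not valid UTF-8, but valid Latin-1
    print(decode_with_fallbacks(raw, server_encoding="latin-1"))
```

The order of the candidates matters. A strict UTF-8 decode rejects most text that is not UTF-8, which is why it is a reasonable first test, whereas a single-byte encoding such as Latin-1 accepts any byte sequence and so should only be tried once the stricter options have failed.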

_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)

