Thanks, John. I'll have a look and let you know if I need more help!

-Laurent.
-- 
Laurent Daudelin
AIM/iChat/Skype:LaurentDaudelin http://www.nemesys-soft.com/
Logiciels Nemesys Software laur...@nemesys-soft.com

On Apr 26, 2011, at 12:39, John Pannell wrote:

> Hi Laurent-
>
> I have an app that collects a lot of text off the web; my string creation
> algorithm is something like the following:
>
> 1. Attempt to create an NSString with NSUTF8StringEncoding.
> 2. If the string is nil, attempt to create the string using the encoding
>    returned from the server.
> 3. If the string is still nil, ask the Text Encoding Conversion Manager to
>    sniff out the encoding from the data.
>    3a. This returns an array of likely encodings. For each item in the
>        array:
>    3b. Attempt to create a string with the encoding.
>
> There was a little too much code associated with this to copy/paste into
> email, but I'd be happy to share... I have a wrapper object for the needed
> interaction with the Text Encoding Conversion Manager. Some more about it:
>
> http://developer.apple.com/library/mac/#documentation/Carbon/Reference/Text_Encodin_sion_Manager/Reference/reference.html%23//apple_ref/doc/uid/TP30000123
>
> Hope this helps!
>
> On Apr 26, 2011, at 12:53 PM, Nick Zitzmann wrote:
>
>>
>> On Apr 26, 2011, at 12:49 PM, Laurent Daudelin wrote:
>>
>>>> TextEdit's encoding guesser just uses the built-in NSAttributedString
>>>> method -initWithURL:options:documentAttributes:error:, which will guess
>>>> the file's encoding when opening it. But it has been mentioned that
>>>> heuristics are not infallible, and this method's heuristics are no
>>>> exception. It does a good job overall, but I've found that it usually
>>>> misinterprets UTF-8 format text.
>>>
>>> Yes, I know that all the guess jobs can fail. I was starting to get excited
>>> when I started reading your reply, but if it usually misinterprets UTF-8,
>>> that's a pretty significant problem...
>>
>> That was a long time ago, so it may have been fixed. But if it's still
>> happening, then one workaround would be to try to open the file as UTF-8
>> first, and if that fails, then fall back on the above method. The UTF-8
>> parser often returns nil on text that is not in UTF-8 format IIRC.
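
For anyone reading this thread later, the fallback order John describes (strict UTF-8 first, then the encoding the server reported, then a list of sniffed candidates) can be sketched roughly as below. This is only an illustration in Python, not John's code and not the Cocoa or Text Encoding Conversion Manager API. The function name decode_with_fallback and its parameters are made up for the example, and the candidate list is assumed to come from whatever sniffer is available.

```python
from typing import Iterable, Optional, Tuple


def decode_with_fallback(
    data: bytes,
    declared_encoding: Optional[str] = None,
    candidate_encodings: Iterable[str] = (),
) -> Optional[Tuple[str, str]]:
    """Return (text, encoding_used), or None if every attempt fails.

    Hypothetical helper: the names and signature are assumptions made
    for this sketch, not part of any library.
    """
    # Step 1: strict UTF-8 goes first, since it rejects most non-UTF-8 input.
    attempts = ["utf-8"]
    # Step 2: the encoding reported by the server, if there was one.
    if declared_encoding:
        attempts.append(declared_encoding)
    # Step 3: candidates from an encoding sniffer, most likely first.
    attempts.extend(candidate_encodings)

    for name in attempts:
        try:
            return data.decode(name), name
        except (UnicodeDecodeError, LookupError):
            # Bytes are invalid in this encoding, or the name is unknown;
            # move on to the next candidate.
            continue
    return None
```

The order matters: a single-byte encoding such as Latin-1 accepts any byte sequence, so trying it before UTF-8 would never fail and would quietly mis-decode UTF-8 text. That is the same reason Nick's workaround opens the file as UTF-8 first and only then falls back on the heuristics.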