On Apr 26, 2011, at 11:43, Nick Zitzmann wrote:

> On Apr 26, 2011, at 12:13 PM, Laurent Daudelin wrote:
> 
>> I've found different ways to do that (some pure Cocoa, some using Carbon) 
>> but I was wondering about the wisdom of this list as to what is the best way 
>> to detect the encoding of a file before passing it to NSString 
>> initWithContentsOfFile:encoding:error:?
> 
> TextEdit's encoding guesser just uses the built-in NSAttributedString method 
> -initWithURL:options:documentAttributes:error:, which will guess the file's 
> encoding when opening it. But it has been mentioned that heuristics are not 
> infallible, and this method's heuristics are no exception. It does a good job 
> overall, but I've found that it usually misinterprets UTF-8 format text.

I finally got around to building a little test program and tried it on a few 
files that a coworker sent me from a Windows machine, compressed into a zip 
archive. I unarchived the files and fed them to my test program. All of them 
failed mightily: just a bunch of symbols, showing clearly that the framework 
cannot guess the encoding. Each file had a specific encoding, but each time the 
framework guessed NSMacOSRomanStringEncoding, which is plainly wrong. I tried 
opening the files in TextEdit and they show up the same way, with the wrong 
encoding.

Looks like I'll have to look for other suggested alternatives….
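
One possible alternative, offered only as a minimal sketch and not as something suggested in this thread: inspect the raw bytes for a byte-order mark and for strict UTF-8 well-formedness before falling back to any heuristic guesser. The names sniffed_encoding, is_valid_utf8 and sniff_encoding are hypothetical and belong to no Cocoa API. The sketch does not try to tell the various Windows code pages apart, and it treats a leading FF FE as UTF-16 even though UTF-32 LE starts with the same two bytes.

```c
#include <stddef.h>

/* Hypothetical result codes for this sketch; not part of any Cocoa API. */
typedef enum {
    ENC_UNKNOWN = 0,  /* no BOM and not well-formed UTF-8: try another guesser */
    ENC_UTF8,
    ENC_UTF16_LE,
    ENC_UTF16_BE
} sniffed_encoding;

/* Returns 1 if buf[0..len) is well-formed UTF-8: no truncated sequences,
 * no overlong forms, no surrogates, nothing above U+10FFFF. */
static int is_valid_utf8(const unsigned char *buf, size_t len) {
    size_t i = 0;
    while (i < len) {
        unsigned char c = buf[i];
        size_t need;            /* continuation bytes expected after c */
        unsigned long cp, min;  /* decoded code point, smallest legal value */

        if (c < 0x80) { i++; continue; }  /* plain ASCII byte */
        else if ((c & 0xE0) == 0xC0) { need = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { need = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { need = 3; cp = c & 0x07; min = 0x10000; }
        else return 0;                    /* stray continuation or invalid lead */

        if (len - i - 1 < need) return 0; /* sequence truncated at end of data */
        for (size_t k = 1; k <= need; k++) {
            unsigned char cc = buf[i + k];
            if ((cc & 0xC0) != 0x80) return 0;
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;
        i += need + 1;
    }
    return 1;
}

/* Checks for a BOM first, then for well-formed UTF-8. Pure ASCII input is
 * reported as ENC_UTF8, since ASCII is a subset of UTF-8. */
static sniffed_encoding sniff_encoding(const unsigned char *buf, size_t len) {
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return ENC_UTF8;
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE) return ENC_UTF16_LE;
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF) return ENC_UTF16_BE;
    if (is_valid_utf8(buf, len)) return ENC_UTF8;
    return ENC_UNKNOWN;
}
```

The idea would be to read the file's bytes (for example from an NSData), call sniff_encoding on them, and pass NSUTF8StringEncoding or the matching UTF-16 constant to NSString only when the result is not ENC_UNKNOWN. Text in a single-byte Windows encoding that contains accented characters will normally fail the UTF-8 check, so it falls through to whatever guesser comes next.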

-Laurent.
-- 
Laurent Daudelin
AIM/iChat/Skype:LaurentDaudelin
http://www.nemesys-soft.com/
Logiciels Nemesys Software
laur...@nemesys-soft.com

_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)