On Apr 26, 2011, at 11:43, Nick Zitzmann wrote:

> On Apr 26, 2011, at 12:13 PM, Laurent Daudelin wrote:
>
>> I've found different ways to do that (some pure Cocoa, some using Carbon)
>> but I was wondering about the wisdom of this list as to what is the best way
>> to detect the encoding of a file before passing it to NSString
>> initWithContentsOfFile:encoding:error:?
>
> TextEdit's encoding guesser just uses the built-in NSAttributedString method
> -initWithURL:options:documentAttributes:error:, which will guess the file's
> encoding when opening it. But it has been mentioned that heuristics are not
> infallible, and this method's heuristics are no exception. It does a good job
> overall, but I've found that it usually misinterprets UTF-8 format text.
I finally got around to building a little test program and tried it on a few files that a coworker sent me from a Windows machine, compressed into a zip archive. I unarchived the files and fed them to my test program. All of them failed mightily: just a bunch of symbols, showing clearly that the framework cannot guess the encoding. Each file had a specific encoding, but each time the encoding was guessed as NSMacOSRomanStringEncoding, which is plainly wrong. I tried opening the files in TextEdit and they show up the same way, with the wrong encoding.

Looks like I'll have to look for other suggested alternatives…

-Laurent.
-- 
Laurent Daudelin
AIM/iChat/Skype:LaurentDaudelin
http://www.nemesys-soft.com/
Logiciels Nemesys Software
laur...@nemesys-soft.com