On Jun 18, 2008, at 12:24 PM, Ken Thomases wrote:
On Jun 18, 2008, at 1:49 PM, JongAm Park wrote:
Can anyone tell me why the two different data sources are both displayed
as "자연", when what they contain is different?
I haven't looked into the specific character sequences in-depth, but
I suspect the difference is in Normalization Forms. Specifically,
form C vs. D.
http://unicode.org/reports/tr15/
The idea is that the same character can be obtained from a single
code point or by several combining code points.
In Cocoa, see -precomposedStringWithCanonicalMapping and
-decomposedStringWithCanonicalMapping.
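
As a rough illustration of the form C vs. form D point, here is a minimal sketch in Python rather than Cocoa. It uses the standard unicodedata module as a stand-in for the two NSString methods named above. The helper name same_text is hypothetical and exists only for this example.

```python
import unicodedata

# "자연" written as two precomposed Hangul syllables (form C).
composed = "\uC790\uC5F0"

# The same text in form D: each syllable is split into its conjoining jamo.
decomposed = unicodedata.normalize("NFD", composed)

def same_text(a, b):
    """Hypothetical helper: compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(composed == decomposed)           # False: the code point sequences differ
print(len(composed), len(decomposed))   # 2 5
print(same_text(composed, decomposed))  # True: canonically equivalent
```

Both strings render as the same two syllables, which is why a plain equality check can fail even though they look identical on screen.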
Sure looks like it, based on the data. EC 9E 90 is U+C790, "자"; E1
84 8C E1 85 A1 is U+110C "ᄌ", U+1161 "ᅡ", which is the decomposed
version of the same thing. -[NSString fileSystemRepresentation] may
also be of use here, given that this is really a file path -- the
normalization form used for file names is dictated by the file system.
--Chris Nebel
AppleScript Engineering
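
The byte sequences quoted above can be checked directly. The following is a small sketch in Python, not Cocoa, that decodes both UTF-8 sequences and confirms that one is the canonical decomposition of the other. The variable names and the code_points helper are made up for the example.

```python
import unicodedata

# The two UTF-8 byte sequences from the message above.
precomposed_bytes = bytes.fromhex("EC 9E 90")
decomposed_bytes = bytes.fromhex("E1 84 8C E1 85 A1")

precomposed = precomposed_bytes.decode("utf-8")
decomposed = decomposed_bytes.decode("utf-8")

def code_points(s):
    """Hypothetical helper: list the code points of s in U+XXXX notation."""
    return [f"U+{ord(c):04X}" for c in s]

print(code_points(precomposed))  # ['U+C790']
print(code_points(decomposed))   # ['U+110C', 'U+1161']

# Normalizing in either direction maps one sequence onto the other.
print(unicodedata.normalize("NFD", precomposed) == decomposed)  # True
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
```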