On May 5, 2014, at 2:06 PM, Jens Alfke wrote:

> How can I map a byte offset in a UTF-8 string back to the corresponding 
> character offset in the NSString it came from?

I don't think there's a great way.

You can do the reverse: map a character (really a UTF-16 code unit) offset to a 
UTF-8 offset using CFStringGetBytes().  You'd pass kCFStringEncodingUTF8 as the 
encoding, a range from 0 to the index you want to map, and NULL for the buffer.  
It will fill in *usedBufLen with the length in bytes that the conversion would 
require.

You could build the reverse map by doing that repeatedly for each character 
index, but that would be expensive, since every call re-converts the string 
from the start.  You'd also have to tolerate failure in case a given character 
index can't be converted (if it falls in the middle of a surrogate pair, for 
example).

So, I suspect that your best bet will be to do the conversion to UTF-8 yourself 
and build the index map as you go.
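
Here's a rough sketch of what I mean, in plain C.  It assumes you've already 
copied the UTF-16 code units out of the NSString (with -getCharacters:range:, 
say).  The name utf8_to_utf16_map is made up for illustration, not an existing 
API, and counting an unpaired surrogate as a 3-byte U+FFFD is my assumption 
about how you'd want to encode it.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of any framework).
 * Walks the UTF-16 code units in s[0..len) and returns a malloc'd table
 * where map[b] is the UTF-16 index at which the character containing
 * UTF-8 byte b begins.  map[total] == len acts as an end sentinel.
 * The total UTF-8 byte count is stored in *outBytes if it is non-NULL.
 * Assumption: an unpaired surrogate is counted as U+FFFD (3 bytes). */
size_t *utf8_to_utf16_map(const uint16_t *s, size_t len, size_t *outBytes)
{
    /* Worst case is 3 UTF-8 bytes per UTF-16 code unit, plus the sentinel. */
    size_t *map = malloc((len * 3 + 1) * sizeof *map);
    if (map == NULL)
        return NULL;

    size_t bytes = 0;
    size_t i = 0;
    while (i < len) {
        uint16_t u = s[i];
        size_t units = 1;   /* UTF-16 code units consumed */
        size_t n;           /* UTF-8 bytes this character needs */

        if (u < 0x80) {
            n = 1;
        } else if (u < 0x800) {
            n = 2;
        } else if (u >= 0xD800 && u <= 0xDBFF && i + 1 < len &&
                   s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            n = 4;          /* valid surrogate pair */
            units = 2;
        } else {
            n = 3;          /* other BMP character, or a lone surrogate */
        }

        /* Every byte of this character maps back to its first code unit. */
        for (size_t k = 0; k < n; k++)
            map[bytes++] = i;
        i += units;
    }

    map[bytes] = len;
    if (outBytes != NULL)
        *outBytes = bytes;
    return map;
}
```

Given a byte offset b into the UTF-8 form, map[b] is the index of the UTF-16 
code unit that starts the character containing that byte.  The caller frees 
the table.  If memory matters, you could store only the byte offsets at 
character boundaries and binary-search them instead.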

Regards,
Ken

