On Mar 21, 2013, at 6:05 PM, Andrew Thompson <lordpi...@me.com> wrote:
> On Mar 21, 2013, at 2:10 PM, Aki Inoue <a...@apple.com> wrote:
>
>> For that matter, UTF-32 (aka UCS-4) is not safe to find the truncation
>> boundary just at the 4-byte boundary.
>
> You're thinking of combining marks here?

Yes.

> It's generally claimed that one can multiply character offsets by 4 to index
> into UCS-4 data… which I think I now see is only true depending on your
> definition of character; i.e whether one considers a decomposed sequence to
> be one character or two.
>
> I see how truncation would be unsafe because you'd chop off the accents etc?

Yes.

Aki

_______________________________________________
Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
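
To make the point above concrete, here is a minimal sketch in Python, whose strings index by code point, much as UCS-4 data does when offsets are multiplied by 4. The helper name safe_truncate is hypothetical and not from this thread. It only backs up over combining marks (nonzero canonical combining class), so it is an illustration of the accent case discussed here, not a full grapheme-cluster segmenter.

```python
import unicodedata

# "café" with a decomposed final letter: 'e' followed by U+0301 COMBINING ACUTE ACCENT.
DECOMPOSED = "cafe\u0301s"


def safe_truncate(text: str, max_code_points: int) -> str:
    """Truncate to at most max_code_points code points without separating
    a base character from the combining marks that follow it.

    Hypothetical helper for illustration; it handles only combining marks,
    not every kind of grapheme cluster.
    """
    if len(text) <= max_code_points:
        return text
    cut = max_code_points
    # If the first code point that would be dropped is a combining mark,
    # the cut falls inside a sequence: back up until it lands on a base.
    while cut > 0 and unicodedata.combining(text[cut]) != 0:
        cut -= 1
    return text[:cut]


# One user-perceived character, but two code points, so 8 bytes in UTF-32.
utf32_len = len("e\u0301".encode("utf-32-le"))

# Cutting at a plain code point boundary keeps the 'e' and drops its accent.
naive = DECOMPOSED[:4]

# The sketch drops the whole decomposed sequence instead.
safe = safe_truncate(DECOMPOSED, 4)
```

Here the naive slice still ends on a 4-byte boundary in UTF-32 terms, yet it turns the accented letter into a bare "e", which is the unsafe truncation described above.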