On 2010-11-22 06:57:36 -0500, spir <denis.s...@gmail.com> said:

(*) Actually, once one has a string of <graphemes/codes/code-units>, routines are the same whatever the kind of element. There could be a generic version in std.string.

Just to add to the complexity: graphemes aren't always equivalent to user-perceived characters either. A ligature can contain more than one user-perceived character. If you're looking for the substring "flourish" in a string, should it fail to match when it encounters "ﬂourish" just because of the "ﬂ" (fl) ligature? On most Mac applications it matches both thanks to sensible defaults in NSString's search and comparison algorithms.
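
To make the ligature question concrete, here is a minimal sketch in Python (not D, and not a description of how NSString works internally). It assumes that Unicode compatibility decomposition (NFKD) plus case folding is an acceptable stand-in for "user-perceived" matching. The helper names fold and contains_loose are made up for this example.

```python
import unicodedata


def fold(text: str) -> str:
    """Hypothetical helper: compatibility-decompose, then case-fold.

    NFKD maps the single code point U+FB02 (the "fl" ligature)
    to the two letters 'f' and 'l'.
    """
    return unicodedata.normalize("NFKD", text).casefold()


def contains_loose(haystack: str, needle: str) -> bool:
    """Hypothetical helper: substring test on the folded forms."""
    return fold(needle) in fold(haystack)


# "\ufb02" is U+FB02 LATIN SMALL LIGATURE FL.
text = "a \ufb02ourish of trumpets"

print("flourish" in text)                # False: code-point comparison
print(contains_loose(text, "flourish"))  # True: ligature is expanded first
```

Note that the folded string is longer than the original, so a match position found this way does not map directly back to an index in the unfolded text. A real search routine would have to track that mapping.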

So perhaps we need yet another layer over graphemes to represent user-perceived characters.

--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/
