On Thu, 24 Apr 2014 23:07:58 +0200
Mathias Bynens wrote:
> I realize reversing a string has nothing to do with text segmentation
> – but ignoring grapheme extenders leads to unexpected results (since
> after reversing the code points, the grapheme extender might extend
> the wrong character):
> h
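
The reversal problem described above can be illustrated with a minimal sketch. It assumes Python and approximates "grapheme extender" by General_Category M* (combining marks), because the standard `unicodedata` module does not expose the `Grapheme_Extend` property. The function names are hypothetical, and this is not a full UAX #29 segmentation.

```python
import unicodedata


def naive_reverse(text: str) -> str:
    """Reverse code point by code point, ignoring combining marks."""
    return text[::-1]


def is_mark(ch: str) -> bool:
    # Assumption: General_Category Mn/Mc/Me stands in for Grapheme_Extend,
    # which unicodedata does not provide. The two sets are not identical.
    return unicodedata.category(ch).startswith("M")


def reverse_keeping_marks(text: str) -> str:
    """Reverse the string while keeping each mark attached to its base."""
    clusters = []
    for ch in text:
        if clusters and is_mark(ch):
            clusters[-1] += ch  # attach the mark to the preceding cluster
        else:
            clusters.append(ch)  # start a new cluster
    return "".join(reversed(clusters))


# "man" + U+0303 COMBINING TILDE + "ana": the tilde belongs to the "n".
sample = "man\u0303ana"
```

With `sample`, `naive_reverse` leaves U+0303 directly after an "a", so the tilde renders over the wrong letter, while `reverse_keeping_marks` keeps it after the "n".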
On Thu, 24 Apr 2014 19:38:54 +
"Whistler, Ken" wrote:
> Yes. Grapheme_Extend characters per se do not "apply" to anything.
> They are a mixture of different General_Category types -- mostly
> combining marks, but not all. The concept of applying to a base only
> refers to combining marks proper.
On 24 Apr 2014, at 21:38, Whistler, Ken wrote:
> Grapheme_Extend characters per se do not "apply" to anything.
> They are a mixture of different General_Category types -- mostly combining
> marks, but not all. The concept of applying to a base only refers to
> combining marks proper.
>
> The pro
d by
> `Grapheme_Extend` characters (which includes the code points in
> `Other_Grapheme_Extend`)?
>
> The email subject should have been “Do `Grapheme_Extend` characters only
> apply to `Grapheme_Base`?” — sorry for the confusion.
>
> Does anyone know the answer?
Yes.
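
The point that `Grapheme_Extend` is a mixture of General_Category types can be checked for a few code points. This is a small sketch in Python; the choice of sample characters is an assumption made for illustration. U+200C and U+FF9E are listed under `Other_Grapheme_Extend` in PropList.txt, and U+0303 is an ordinary nonspacing mark. The `unicodedata` module reports only the General_Category, not `Grapheme_Extend` itself.

```python
import unicodedata

# Sample code points (chosen for illustration):
#   U+0303 is a combining mark proper,
#   U+200C and U+FF9E are in Other_Grapheme_Extend but are not marks.
SAMPLES = ["\u0303", "\u200c", "\uff9e"]


def describe(ch: str) -> str:
    """Return the code point, its name, and its General_Category."""
    return "U+{:04X} {} gc={}".format(
        ord(ch), unicodedata.name(ch), unicodedata.category(ch)
    )


for ch in SAMPLES:
    print(describe(ch))
```

Only the first of the three has a General_Category starting with "M", which is why a test based on combining marks alone does not cover every grapheme extender.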
Mathias Bynens wrote:
> Let's say I'm writing a program that strips combining characters and
> grapheme extenders from an input string.
>
> For combining marks, I'm looking for any non-combining marks (e.g.
> 'a') followed by one or more combining marks (e.g. ' ̃'), and then I
> remove everything
> [...] characters
> (which includes the code points in `Other_Grapheme_Extend`)?
The email subject should have been “Do `Grapheme_Extend` characters only apply
to `Grapheme_Base`?” — sorry for the confusion.
Does anyone know the answer?
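
For reference, one possible reading of the mark-stripping program mentioned in this thread, as a hedged sketch in Python. It assumes that "combining marks" means General_Category M* and simply drops those code points. The function name is hypothetical, and grapheme extenders that are not marks (for example U+200C) are deliberately left untouched, since whether they should be handled the same way is the open question here.

```python
import unicodedata


def strip_combining_marks(text: str) -> str:
    """Drop every code point whose General_Category is Mn, Mc, or Me."""
    return "".join(
        ch for ch in text
        if not unicodedata.category(ch).startswith("M")
    )


def strip_marks_after_decomposing(text: str) -> str:
    """Decompose to NFD first so precomposed letters lose their marks too."""
    return strip_combining_marks(unicodedata.normalize("NFD", text))
```

`strip_combining_marks` only sees marks that are already separate code points. A precomposed letter such as U+00F1 passes through unchanged unless the input is decomposed first, which is what the second function does.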