Thanks for the feedback. You're correct about this; the RI-Sequence rule is a holdover from an earlier version of the document, which had a more basic treatment of RI sequences.
There is already an action to modify these. There is a placeholder review note about that just above http://www.unicode.org/reports/tr29/proposed.html#Table_Combining_Char_Sequences_and_Grapheme_Clusters (scroll up just a bit).

Mark
<https://twitter.com/mark_e_davis>

On Sun, Dec 17, 2017 at 4:16 PM, David P. Kendal via Unicode <unicode@unicode.org> wrote:

> Hi,
>
> It’s possible I’m missing something, but the formal grammar/regular
> expression given for extended grapheme clusters appears to have a bug
> in it.
> <https://unicode.org/reports/tr29/#Table_Combining_Char_Sequences_and_Grapheme_Clusters>
>
> The bug is here:
>
> RI-Sequence := Regional_Indicator+
>
> If the formal grammar is intended to exactly match the rules given in
> the “Grapheme Cluster Boundary Rules” section below it as-is, then
> this should be
>
> RI-Sequence := Regional_Indicator Regional_Indicator
>
> since as given it would cause any number of RI characters to coalesce
> into a single grapheme cluster, instead of pairs of characters. That
> is, the text U+1F1EC U+1F1E7 U+1F1EA U+1F1FA would represent one
> grapheme cluster instead of the correct two.
>
> --
> dpk (David P. Kendal) · Nassauische Str. 36, 10717 DE · http://dpk.io/
> we do these things not because they are easy, +49 159 03847809
> but because we thought they were going to be easy
> — ‘The Programmers’ Credo’, Maciej Cegłowski
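
For anyone following along, the difference between the two productions quoted above can be checked with a small Python sketch. It is only an illustration, not the UAX #29 algorithm: it translates each production into a regular expression over the Regional_Indicator range U+1F1E6..U+1F1FF and counts the matches in the four-character example from the report. The names `RI_SEQUENCE_AS_PUBLISHED`, `RI_SEQUENCE_AS_PROPOSED`, and `split_ri` are made up for this example.

```python
import re

# Regional_Indicator code points are U+1F1E6..U+1F1FF.
RI = "[\U0001F1E6-\U0001F1FF]"

# Hypothetical names: the production as quoted from the table,
# and the pairwise production suggested in the report.
RI_SEQUENCE_AS_PUBLISHED = re.compile(RI + "+")   # Regional_Indicator+
RI_SEQUENCE_AS_PROPOSED = re.compile(RI + "{2}")  # Regional_Indicator Regional_Indicator


def split_ri(pattern, text):
    """Return the successive non-overlapping matches of pattern in text."""
    return pattern.findall(text)


# U+1F1EC U+1F1E7 U+1F1EA U+1F1FA, the example from the report.
text = "\U0001F1EC\U0001F1E7\U0001F1EA\U0001F1FA"

published = split_ri(RI_SEQUENCE_AS_PUBLISHED, text)
proposed = split_ri(RI_SEQUENCE_AS_PROPOSED, text)

print(len(published), len(proposed))  # prints "1 2"
```

With `Regional_Indicator+` the whole run is consumed as one sequence, while the pairwise form yields two. The sketch ignores a trailing unpaired RI, which a full segmenter would still have to treat as its own cluster.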