On Fri, Dec 4, 2020, at 20:53, Florian Schmaus wrote:
> If you count the bytes of the UTF-8 encoded representation, then there
> is no way to have any fallback (as the indexes would be wrong).

Maybe I don't understand the fallback you're proposing. I do understand your example, and assert that it doesn't matter. You're unlikely to have an invalid offset, and if you do, we can define a fallback for that. It might be "the range ends at the start of the codepoint" (so you have to decode a single codepoint, not the entire range), or it might be "this is an invalid range, don't display anything".

> This is, of course, because in the example the number of code points
> and graphemes is identical. But this allows developers to easily
> bootstrap this scheme by simply counting code points in the beginning.
> I wouldn't be surprised if that it would work so well that they never
> even switch to grapheme counting.

We could also easily count bytes, and I wouldn't be surprised if that worked well enough that we never have to switch to anything else.

—Sam
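
For illustration, here is a minimal Python sketch of the first fallback mentioned above ("the range ends at the start of the codepoint"). The function names are hypothetical and not taken from any XEP or library. It assumes that offsets count bytes of the UTF-8 encoding and that an offset landing inside a codepoint is moved back to that codepoint's first byte. Only the bytes around the offset are inspected, not the whole range.

```python
def snap_to_codepoint_start(data: bytes, offset: int) -> int:
    """Move a byte offset back to the nearest UTF-8 codepoint boundary.

    Hypothetical helper; assumes `data` is valid UTF-8.
    """
    # Clamp to the valid range first; len(data) is itself a valid boundary.
    offset = max(0, min(offset, len(data)))
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx.
    # Step back until we are on a lead byte (or at either end of the data).
    while 0 < offset < len(data) and (data[offset] & 0xC0) == 0x80:
        offset -= 1
    return offset


def slice_by_byte_range(text: str, begin: int, end: int) -> str:
    """Return the part of `text` covered by a UTF-8 byte range.

    Offsets that fall inside a codepoint are snapped back to its start,
    so the resulting slice always decodes cleanly.
    """
    data = text.encode("utf-8")
    begin = snap_to_codepoint_start(data, begin)
    end = snap_to_codepoint_start(data, end)
    return data[begin:end].decode("utf-8")


if __name__ == "__main__":
    body = "na\u00efve \u2615 caf\u00e9"
    # Offset 3 falls inside the two-byte encoding of U+00EF,
    # so the range is shortened to end before that codepoint.
    print(slice_by_byte_range(body, 0, 3))
```

The other fallback ("this is an invalid range, don't display anything") would replace the snapping step with a check that rejects any offset pointing at a continuation byte.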