On Fri, Dec 4, 2020, at 20:53, Florian Schmaus wrote:
> If you count the bytes of the UTF-8 encoded representation, then there
> is no way to have any fallback (as the indexes would be wrong).

Maybe I don't understand the fallback you're proposing. I do understand
your example, and assert that it doesn't matter. You're not likely to
have an invalid offset, and if you do, we can define a fallback for
that. It might be "the range ends at the start of the codepoint" (so you
have to decode a single codepoint, not the entire range), or it might be
"this is an invalid range, don't display anything".
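
To make the two fallbacks concrete, here is a rough sketch in Python. It
is only an illustration, not something from any spec: the function names
are made up, and it assumes the body is already valid UTF-8. It relies on
the fact that UTF-8 continuation bytes always match 0b10xxxxxx, so a
receiver only has to look at a few bytes around the offset instead of
decoding the whole range.

```python
def snap_to_codepoint_start(data: bytes, offset: int) -> int:
    """Move a byte offset back to the start of the code point it falls in.

    Hypothetical helper; assumes `data` is valid UTF-8.
    """
    if not 0 <= offset <= len(data):
        raise ValueError("offset outside the buffer")
    # Continuation bytes have the bit pattern 10xxxxxx. In valid UTF-8
    # there are at most three of them in a row, so this loop is short.
    while 0 < offset < len(data) and (data[offset] & 0xC0) == 0x80:
        offset -= 1
    return offset


def is_codepoint_boundary(data: bytes, offset: int) -> bool:
    """Second fallback: report whether the offset is usable as-is."""
    return snap_to_codepoint_start(data, offset) == offset


body = "naïve".encode("utf-8")  # 'ï' occupies bytes 2 and 3

# Fallback 1: end the range at the start of the code point.
end = snap_to_codepoint_start(body, 3)
prefix = body[:end].decode("utf-8")

# Fallback 2: treat the range as invalid and display nothing.
usable = is_codepoint_boundary(body, 3)
```

With the first fallback the range quietly shrinks by one byte; with the
second the receiver would drop the range entirely. Either way the check
is cheap.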

> This is, of course, because in the example the number of code points
> and graphemes is identical. But this allows developers to easily
> bootstrap this scheme by simply counting code points in the beginning.
> I wouldn't be surprised if it would work so well that they never
> even switch to grapheme counting.

We could just as easily count bytes, and I wouldn't be surprised if that
worked well enough that we never have to switch to anything else.

—Sam
_______________________________________________
Standards mailing list
Info: https://mail.jabber.org/mailman/listinfo/standards
Unsubscribe: standards-unsubscr...@xmpp.org
_______________________________________________