FWIW, I was a big proponent of doing it this way too, but I've changed my mind after seeing too many grapheme segmentation implementations be broken in small, different ways. My new position is that we should just count bytes and define a sane behavior for when someone sends us an invalid offset, for example one in the middle of a code point. This is encoding-agnostic (not that it matters for XMPP) and makes counting very easy: go to that byte offset and check whether we're on any sort of UTF-8 boundary. If so, call it a day; if not, do whatever the fallback is.
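To make the byte-offset idea concrete, here is a rough Python sketch. The function names are made up for illustration, and the fallback (clamp the offset into range, then round down to the previous boundary) is only an assumption on my part, since nothing has been decided about what the fallback should be. The boundary check relies on the fact that UTF-8 continuation bytes always have the bit pattern 10xxxxxx.

```python
def is_utf8_boundary(data: bytes, offset: int) -> bool:
    """Return True if `offset` does not point into the middle of a code point."""
    if offset < 0 or offset > len(data):
        return False
    # The start and end of the buffer always count as boundaries.
    if offset == 0 or offset == len(data):
        return True
    # Continuation bytes look like 0b10xxxxxx; any other byte starts a code point.
    return (data[offset] & 0xC0) != 0x80


def resolve_offset(data: bytes, offset: int) -> int:
    """Return a usable byte offset.

    Assumed fallback (not from any spec): clamp into range, then round
    down to the nearest preceding boundary.
    """
    offset = max(0, min(offset, len(data)))
    while not is_utf8_boundary(data, offset):
        offset -= 1
    return offset


if __name__ == "__main__":
    body = "héllo".encode("utf-8")  # 'é' takes two bytes, at offsets 1 and 2
    for requested in (1, 2, 3, 100):
        print(requested, "->", resolve_offset(body, requested))
```

Whatever fallback we pick, the check itself is a single mask and compare on one byte, which is much harder to get subtly wrong than grapheme segmentation.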
—Sam

On Fri, Dec 4, 2020, at 14:15, Florian Schmaus wrote:
> Reply containing rant about how unpractical grapheme cluster counting
> is in 3, 2, 1… :)