On Sat, Oct 19, 2019 at 04:41:19PM +0000, Sam Whited wrote:
> On Sat, Oct 19, 2019, at 04:57, JC Brand wrote:
> > You might still have an offset in between two codepoints that should
> > ideally be shown together like "EU" making the EU flag, but this seems
> > less of an issue to me.
>
> I don't know if this is better or not, and I'm still not sure how best
> to handle it. If you end up with text in the middle of a UTF-8 encoding,
> at least that's clearly an error. If it's in between the two letters in
> a flag emoji, that's not necessarily an error and there are tons of
> different ways you could handle it, which seems much more complex
You don't need tons of ways, you can just follow the instructions. If the
sending client is buggy, then this will become clear over time.

> Does this break the flag emoji back into the letter glyphs that are
> shown if it doesn't form a flag?

Yes, you just render the two letters separately given that this is what's
implied by the information you've been given and it's also a legitimate
use-case.

By referencing only one of two consecutive letter glyphs, you're indicating
that they're logically distinct, so it makes sense that they're not rendered
together.

In any case, usually you'll want to somehow highlight, make clickable or
replace the referenced text, thereby affirming the need to render them
separately.

> What if it's between something and a
> zero-width joiner that would join it to another glyph, does that split
> that and now you have a dangling joiner?

This is as clearly an error as setting an offset in the middle of a UTF-8
encoding.

> From a code perspective does
> this mean that highlighting always has to integrate with the text
> rendering engine? This seems like a *major* downside to me, as it likely
> makes the code much more complicated, and we may or may not even have
> the ability to manipulate how the text rendering engine handles things.

It's not clear to me why you think highlighting will necessarily require
integration with the rendering engine. It should be possible to identify
unicode codepoints in a string independent of any rendering engine.
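
To illustrate the last point, here is a rough sketch in Python of working with offsets without touching any rendering engine. It relies on Python 3 strings being indexed by code point. The function names (split_at, splits_zwj_sequence, is_utf8_boundary) are made up for this example and don't come from any XEP or library, and treating an offset next to a zero-width joiner as an error is just the reading suggested above, not a rule from a spec.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER


def split_at(text: str, begin: int, end: int) -> tuple[str, str, str]:
    """Split text into (before, referenced, after) by code point offsets.

    Python 3 str indices count code points, so no renderer is involved.
    """
    if not 0 <= begin <= end <= len(text):
        raise ValueError("offsets out of range")
    return text[:begin], text[begin:end], text[end:]


def splits_zwj_sequence(text: str, offset: int) -> bool:
    """True if the offset sits directly before or after a zero-width joiner.

    Assumption for this sketch: such an offset would leave a dangling
    joiner, so the caller treats it as an error.
    """
    before = text[offset - 1] if 0 < offset <= len(text) else ""
    after = text[offset] if 0 <= offset < len(text) else ""
    return before == ZWJ or after == ZWJ


def is_utf8_boundary(data: bytes, offset: int) -> bool:
    """True if a byte offset does not land inside a UTF-8 encoded code point."""
    if offset == 0 or offset == len(data):
        return True
    if not 0 < offset < len(data):
        return False
    # Continuation bytes have the bit pattern 10xxxxxx.
    return (data[offset] & 0xC0) != 0x80


# The "EU" flag is two regional indicator code points: U+1F1EA, U+1F1FA.
flag = "\U0001F1EA\U0001F1FA"
before, referenced, after = split_at("a" + flag + "b", 1, 2)
# 'referenced' now holds only the first regional indicator, so the two
# letters end up in separate pieces and would be rendered separately.
```

Splitting on the first regional indicator alone leaves the second one in the trailing piece, which matches the "render the two letters separately" behaviour described above.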