On Sat, Oct 19, 2019 at 04:41:19PM +0000, Sam Whited wrote:
> On Sat, Oct 19, 2019, at 04:57, JC Brand wrote:
> > You might still have an offset in between two codepoints that should
> > ideally be shown together like "EU" making the EU flag, but this seems
> > less of an issue to me.
> 
> I don't know if this is better or not, and I'm still not sure how best
> to handle it. If you end up with text in the middle of a UTF-8 encoding,
> at least that's clearly an error. If it's in between the two letters in
> a flag emoji, that's not necessarily an error and there are tons of
> different ways you could handle it, which seems much more complex

You don't need tons of ways; you can just follow the instructions. If the
sending client is buggy, then this will become clear over time.

> Does this break the flag emoji back into the letter glyphs that are
> shown if it doesn't form a flag?

Yes, you just render the two letters separately, since that's what the
information you've been given implies, and it's also a legitimate
use case.

By referencing only one of two consecutive letter glyphs, you're indicating
that they're logically distinct, so it makes sense that they're not rendered
together. In any case, usually you'll want to somehow highlight, make clickable
or replace the referenced text, thereby affirming the need to render them
separately.
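
As a rough sketch of what I mean (Python, where string indices already
count code points; the function name is made up for illustration and not
taken from any spec):

```python
def split_at_reference(text, begin, end):
    """Split text into (before, referenced, after) using code point offsets.

    Hypothetical helper: Python str indices count code points, so plain
    slicing is enough here.
    """
    if not 0 <= begin <= end <= len(text):
        raise ValueError("offsets out of range")
    return text[:begin], text[begin:end], text[end:]


# Two regional indicator symbols (E and U) that together display as a flag.
flag = "\U0001F1EA\U0001F1FA"

# A reference covering only the first of them yields two separate pieces,
# which would then be rendered (and highlighted) separately.
before, ref, after = split_at_reference(flag, 0, 1)
```

Each piece is then handed to whatever displays it on its own, so the two
letters no longer combine.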

> What if it's between something and a
> zero-width joiner that would join it to another glyph, does that split
> that and now you have a dangling joiner?

This is as clearly an error as setting an offset in the middle of a UTF-8
encoding.
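
If you wanted to detect that case on the receiving side, one possible
check (a sketch only; the name is hypothetical, and what a client should do
once it finds such an offset is not something I'm specifying here) could
look like this:

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER


def splits_joiner_sequence(text, offset):
    """Return True if cutting text at this code point offset would
    separate a zero-width joiner from one of its neighbours."""
    if offset <= 0 or offset >= len(text):
        # Cutting at the very start or end cannot split anything.
        return False
    return text[offset - 1] == ZWJ or text[offset] == ZWJ


# MAN + ZWJ + WOMAN, a fragment of a joined emoji sequence.
pair = "\U0001F468\u200d\U0001F469"
dangling = [splits_joiner_sequence(pair, i) for i in range(len(pair) + 1)]
```

Offsets 1 and 2 are flagged here, since either one would leave a joiner
dangling; offsets 0 and 3 are fine.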

> From a code perspective does
> this mean that highlighting always has to integrate with the text
> rendering engine? This seems like a *major* downside to me, as it likely
> makes the code much more complicated, and we may or may not even have
> the ability to manipulate how the text rendering engine handles things.

It's not clear to me why you think highlighting will necessarily require
integration with the rendering engine. It should be possible to identify
Unicode code points in a string independent of any rendering engine.
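
For illustration, here is a minimal sketch in Python that uses nothing but
plain string handling. Both function names are made up, and the UTF-16
conversion assumes a platform whose strings are indexed in UTF-16 code
units (as JavaScript's are):

```python
def code_points(text):
    """List the code points of a string in U+XXXX notation."""
    return ["U+{:04X}".format(ord(ch)) for ch in text]


def codepoint_to_utf16_offset(text, offset):
    """Convert a code point offset into a UTF-16 code unit offset.

    Every UTF-16 code unit is two bytes, so the encoded length of the
    prefix divided by two gives the number of code units before offset.
    """
    return len(text[:offset].encode("utf-16-le")) // 2


message = "hi \U0001F1EA\U0001F1FA"
points = code_points(message)
# Code point offset 4 lies between the two regional indicator symbols.
utf16_offset = codepoint_to_utf16_offset(message, 4)
```

No text layout or font information is involved at any point; the offsets
are resolved before anything is drawn.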

_______________________________________________
Standards mailing list
Info: https://mail.jabber.org/mailman/listinfo/standards
Unsubscribe: standards-unsubscr...@xmpp.org
_______________________________________________
