[plasmashell] [Bug 434184] Duplicate emojis

Adam Fontenot Wed, 06 Oct 2021 00:48:30 -0700

https://bugs.kde.org/show_bug.cgi?id=434184


--- Comment #5 from Adam Fontenot <adam.m.fontenot+...@gmail.com> ---
I had a look at fixing this and ran into problems. The Ibus interface for
fetching emojis is *really* limited. In particular:

1. Ibus itself is affected by this bug. Its internal emoji tool shows the four
transgender flags 🏳️‍⚧️ although they are (incorrectly!!) hidden under variants.

2. None of the data we can easily get out of Ibus contains information about
whether a particular emoji is fully qualified or not.

3. Ibus's variant handling is slapped on top of a data format that was clearly
not built to support modifiers. At runtime, it takes its pregenerated
dictionary of emoji and extracts a single "base" character, which is used to
hide any emoji sequence beginning with that character behind a menu. This is
incorrect behavior - as a result the rainbow and trans flags are hidden behind
a white flag, since that's the first character of their sequences.
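
To make point 3 concrete, here is a minimal sketch of what grouping by first
code point does to the flag sequences. This is a hypothetical illustration, not
Ibus's actual code; the function name and the sample list are made up for the
example.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical illustration (not Ibus's code): file every emoji sequence
// under its first code point, as described in point 3 above.
std::map<char32_t, std::vector<std::u32string>>
groupByFirstCodePoint(const std::vector<std::u32string> &sequences)
{
    std::map<char32_t, std::vector<std::u32string>> groups;
    for (const std::u32string &seq : sequences) {
        if (!seq.empty()) {
            groups[seq.front()].push_back(seq);
        }
    }
    return groups;
}

// Sample input: white flag, rainbow flag, transgender flag, black flag.
const std::vector<std::u32string> kSampleFlags = {
    U"\U0001F3F3\uFE0F",
    U"\U0001F3F3\uFE0F\u200D\U0001F308",
    U"\U0001F3F3\uFE0F\u200D\u26A7\uFE0F",
    U"\U0001F3F4",
};
```

With this input, the rainbow and transgender flags land in the same group as
the white flag (key U+1F3F3), because both sequences start with that code
point. The black flag gets a group of its own.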

For these reasons I think the only way forward is for KDE to maintain its own
emoji dataset and update it with every Unicode release. The basic problem is
that it's very hard to know whether a given emoji is fully qualified or not
without using hardcoded data. For example:

The code receives the emoji 1F3F4 (🏴). It needs to know whether this emoji is
fully qualified (if not, it discards it). It turns out this one is. But 1F3F3
(🏳) is not. The *only* reason there's a difference is that the black flag is
listed in the emoji data files as having a default emoji presentation, and the
white flag is not (it has a text presentation on most platforms).
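
A rough sketch of what the hardcoded-data check could look like for the
simplest case, a single code point optionally followed by U+FE0F. The table
here is a tiny hand-copied excerpt and the function name is hypothetical; a
real table would have to be generated from the Unicode emoji data files, and
longer sequences need more rules than this.

```cpp
#include <string>
#include <unordered_set>

// Tiny hand-copied excerpt of code points with default emoji presentation.
// Assumption: a real table would be regenerated for every Unicode release.
const std::unordered_set<char32_t> kDefaultEmojiPresentation = {
    U'\U0001F3F4', // black flag
    U'\U0001F308', // rainbow
};

constexpr char32_t kVariationSelector16 = U'\uFE0F';

// Handles only a lone code point, or a code point followed by U+FE0F.
// Anything longer is out of scope for this sketch and returns false.
bool isQualifiedSingleEmoji(const std::u32string &s)
{
    if (s.size() == 1) {
        return kDefaultEmojiPresentation.count(s[0]) > 0;
    }
    if (s.size() == 2) {
        return s[1] == kVariationSelector16;
    }
    return false;
}
```

Under these assumptions, 1F3F4 alone is accepted, 1F3F3 alone is rejected, and
1F3F3 followed by FE0F is accepted. Nothing in the code points themselves tells
the two flags apart; the table does all the work.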

In addition, if you want proper handling of modifiers using sub-menus (which
would be really nice, especially if you could set a gender / skin color
default), you'll probably need more data than you can get out of Ibus.

Not sure if I'm the right person to work on this. My knowledge of C++ is pretty
limited. I could probably code an ugly workaround, like stripping out the
emoji presentation selector U+FE0F (assuming QString allows iterating by code
points), and using a hashmap to store and find the longest emoji sequence in
each group, if this is unlikely to be fixed any other way.
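
For reference, the workaround could look roughly like the sketch below. It
uses std::u32string so it stands alone; in Qt, QString::toUcs4() could supply
the code points. The function names are hypothetical, and the sketch assumes
the longest sequence in a group is the one to keep.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Drop every U+FE0F so that all spellings of one emoji share a key.
std::u32string stripVariationSelector(const std::u32string &seq)
{
    std::u32string out;
    for (char32_t cp : seq) {
        if (cp != U'\uFE0F') {
            out.push_back(cp);
        }
    }
    return out;
}

// Keep one entry per stripped key: the longest original sequence seen.
// Results come back in the order each key first appeared in the input.
std::vector<std::u32string>
dedupeKeepLongest(const std::vector<std::u32string> &input)
{
    std::unordered_map<std::u32string, std::u32string> longest;
    std::vector<std::u32string> order; // keys in first-seen order
    for (const std::u32string &seq : input) {
        const std::u32string key = stripVariationSelector(seq);
        auto it = longest.find(key);
        if (it == longest.end()) {
            longest.emplace(key, seq);
            order.push_back(key);
        } else if (seq.size() > it->second.size()) {
            it->second = seq;
        }
    }
    std::vector<std::u32string> result;
    for (const std::u32string &key : order) {
        result.push_back(longest.at(key));
    }
    return result;
}
```

This only removes duplicates; it does not know whether the surviving sequence
is actually fully qualified, so it is a stopgap, not a replacement for a
proper dataset.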
