https://bugs.kde.org/show_bug.cgi?id=412271

--- Comment #6 from Brennan Kinney <polarathene-sig...@hotmail.com> ---
> I doubt this is what you were after.

Actually, consistency would help at least. The mixed space behaviour was
confusing.

> Could you clarify how the decision when to start the incremental search could 
> be improved?

How do users tend to make use of the search field? 

If I want all glyphs related to "q" and I have already seen instant results
with other inputs, then as a user I'd assume there is no minimum input
requirement. Having to pad the query with spaces until it reaches 3 UTF-16
code units is non-obvious.

If I want to search for an emoji such as "🤣", again, pasting it into the search
field gives no immediate result. I only found out by mistake that adding spaces
somehow triggered a result, later learning that the order was not important.

The minimum requirement only makes things confusing here for what I'd imagine
is a common use-case: looking up a single character/glyph without knowing its
text name or Unicode value (or whatever U+1F923 is).
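To illustrate why a single pasted character falls short of the minimum, here is
a rough Python sketch (not KCharSelect code). The threshold of 3 UTF-16 code
units is taken from the behaviour I described above and is an assumption, as
are the names `utf16_units`, `MIN_UNITS` and `triggers_incremental_search`.

```python
# Sketch only: mimics the minimum-input behaviour described in this comment.
# MIN_UNITS = 3 is an assumption based on what I observed, not on the source.
MIN_UNITS = 3


def utf16_units(text: str) -> int:
    """Count the UTF-16 code units needed to encode text."""
    return len(text.encode("utf-16-le")) // 2


def triggers_incremental_search(query: str) -> bool:
    """Hypothetical check: spaces count towards the minimum, as observed."""
    return utf16_units(query) >= MIN_UNITS


if __name__ == "__main__":
    for query in ["q", "q  ", "\U0001F923", "\U0001F923 "]:
        print(
            repr(query),
            "code points:", len(query),
            "UTF-16 units:", utf16_units(query),
            "searches:", triggers_incremental_search(query),
        )
```

So "🤣" is one code point but two UTF-16 code units, and only reaches the
assumed minimum once a space is added.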

Getting plenty of results during input isn't an issue, since an empty search
field already shows everything in the current subsection. So it's not a
realistic performance concern, and wouldn't the results just filter down
regardless as the user types more characters into the query?

Otherwise, some indication of the minimum input would be helpful. You mention
the user can press the "return/enter" key to avoid the spaces, but there is no
UI "search" button, just immediate results once the minimum input is reached,
so as a user it's a less obvious action. (I might assume that enter on a field
does the expected thing, but for me that assumption was dispelled because I had
previously seen immediate results when searching Unicode values; it just did
not occur to me to try.)


> KCharSelect doesn't implement the Unicode Emoji standard. It only works with 
> codepoints.

Could you detect this like Konsole does? It notices invisible/zero-width
codepoints and offers to remove them. Perhaps you could remove FE0F (it is
valid to search by this value, but not by its "rendered" glyph) and the ZWJ
codepoint, such that the female spy would paste as two separate glyphs (you
could separate them with a space perhaps?).

Alternatively, upon detection, straight-up inform the user that this type of
input is not supported by KCharSelect, which only handles single/individual
codepoints (and leave the user to figure out what that means).
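Both options could be sketched roughly like this in Python (again not
KCharSelect code). The function names are made up, and I'm assuming the
"female spy" is the sequence U+1F575 U+FE0F U+200D U+2640 U+FE0F.

```python
# Sketch only: detect and split an emoji ZWJ sequence pasted into a search box.
VARIATION_SELECTOR_16 = "\uFE0F"  # requests emoji presentation, no glyph itself
ZERO_WIDTH_JOINER = "\u200D"      # glues several emoji into one rendered glyph


def has_invisible_codepoints(text: str) -> bool:
    """True if the input contains FE0F or ZWJ, so the user could be warned."""
    return VARIATION_SELECTOR_16 in text or ZERO_WIDTH_JOINER in text


def split_emoji_sequence(text: str) -> str:
    """Drop FE0F and turn each ZWJ into a space between the joined glyphs."""
    return text.replace(VARIATION_SELECTOR_16, "").replace(ZERO_WIDTH_JOINER, " ")


if __name__ == "__main__":
    # Assumed code points for the "female spy" sequence.
    pasted = "\U0001F575\uFE0F\u200D\u2640\uFE0F"
    if has_invisible_codepoints(pasted):
        cleaned = split_emoji_sequence(pasted)
        print([f"U+{ord(ch):04X}" for ch in cleaned.split()])
```

With that, the pasted sequence becomes two space-separated codepoints that a
codepoint-only search can handle individually.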

> The word FACE is unfortunately also a hex word. A group of 4 or 5 hex digits 
> are treated as codepoints.

Yes, I understand that. My confusion was about why that result is returned when
the other part of the query is "confused", which has nothing to do with the
FACE result.

For example, "1F601 😂 1F923" as a query will return only the two codepoints
specified; the emoji glyph in the middle is omitted. Similarly, "😁 😂 🤣" gives
no results. Something is wrong with the query/filtering here: Unicode values
are kind of treated as "OR", but the emoji glyphs are treated like "AND"
keywords (as in, they won't return a result unless all keywords are relevant).

"confused 😕 fac" is similar: the emoji glyph doesn't appear to have any effect
on the results, it only works by itself as "😕 ".
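For what it's worth, here is a Python sketch of one way the tokens could be
classified consistently. This is purely hypothetical and not how KCharSelect
parses queries today; `classify` and `parse_query` are names I made up. The
4 or 5 hex digit rule is the one you described.

```python
# Sketch only: classify each whitespace-separated token of a query the same way.
import re

# A group of 4 or 5 hex digits is treated as a codepoint (rule quoted above).
HEX_CODEPOINT = re.compile(r"[0-9A-Fa-f]{4,5}")


def classify(token: str):
    """Return ("codepoint", value) or ("keyword", text) for one token."""
    if HEX_CODEPOINT.fullmatch(token):
        return ("codepoint", int(token, 16))
    if len(token) == 1:
        # A single pasted character is looked up by its own codepoint.
        return ("codepoint", ord(token))
    return ("keyword", token.lower())


def parse_query(query: str):
    """Split on whitespace and classify every token."""
    return [classify(token) for token in query.split()]


if __name__ == "__main__":
    for query in ["1F601 \U0001F602 1F923", "confused \U0001F615 fac", "FACE"]:
        print(query, "->", parse_query(query))
```

With something like this, a pasted glyph would be handled exactly like its hex
value, so the first example would yield three codepoints instead of two.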

---

SUMMARY

It'd be nice if there was some consistency in these behaviours, and if certain
inputs are not supported, the UI could inform the user.

Of interest might be this article:
https://hsivonen.fi/string-length/

It points out how various languages handle such input/strings differently and
why. Perhaps the mentioned Rust crate could be used to improve the current
parsing? (though being another language probably makes that a hard no?)
