https://bugs.kde.org/show_bug.cgi?id=518035

--- Comment #3 from JATothrim <[email protected]> ---
(In reply to Sven Brauch from comment #2)
> 
> Either way, the linked TODO is unlikely to be the problem on Linux, since
> the "local encoding" is typically UTF-8. For sure the problem exists even
> for such files.

You are right, this is not the cause of the bug:
QString::fromLocal8Bit(file.readAll()).toUtf8() makes no difference to the bug.
My intent was just to forward the links from the last post on discuss and
confirm the bug.

I have not analyzed the bug in detail; I'm currently just reading the code.
The TODO in parsejob.cpp is, however, at least a place to start reading the
code.

> Kate's cursor position is counted in UTF16 surrogates (which is somewhat 
> curious for 2026 tech).

000001e0  6c f0 9f 98 84 22 20 3c  3c 20 6c 6f 6f 6f 6f 6f  |l...." << looooo|

000001e0  6c 61 ef bf bd 22 20 3c  3c 20 6c 6f 6f 6f 6f 6f  |la..." << looooo|

In overwrite mode, typing over a multi-byte codepoint with e.g. "a" breaks it,
as the two dumps above show: f0 9f 98 84 becomes 61 ef bf bd (U+FFFD).
(This also happens in Kate.)
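
To illustrate what I think happens in the dumps above, here is a minimal
standalone sketch (plain C++, no Qt or KTextEditor code). toUtf8Lossy() is a
hypothetical helper written only for this example, and the assumption is that
the editor buffer is UTF-16 and overwrite mode replaces exactly one code unit:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: encode UTF-16 as UTF-8, writing U+FFFD for any
// surrogate that is not part of a valid high/low pair.
std::string toUtf8Lossy(const std::u16string &s)
{
    std::string out;
    auto append = [&out](char32_t cp) {
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    };
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char32_t u = s[i];
        const bool isHigh = u >= 0xD800 && u <= 0xDBFF;
        const bool nextIsLow = i + 1 < s.size()
            && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF;
        if (isHigh && nextIsLow) {
            // Valid surrogate pair: combine into one code point.
            append(0x10000 + ((u - 0xD800) << 10) + (s[i + 1] - 0xDC00));
            ++i;
        } else if (u >= 0xD800 && u <= 0xDFFF) {
            append(0xFFFD); // lone surrogate
        } else {
            append(u);
        }
    }
    return out;
}
```

With the buffer u"l\U0001F604\"", replacing only the code unit at index 1 (the
high surrogate) with u'a' leaves the low surrogate on its own, and the encoded
result is 6c 61 ef bf bd 22, which matches the second dump.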

> Maybe some languages / situations report ranges in bytes, not unicode code 
> points?

I did a quick test in a CMakeLists.txt file, and I didn't see the bug
immediately happening there.

Searching for clang_getFileLocation() gives a better starting point for
investigating this bug. Annoyingly, clang's CXSourceLocation doesn't say in
which unit (bytes, codepoints?) it reports the location.
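
If the reported column turns out to be a byte offset into the UTF-8 line (an
assumption, I have not verified this), it would need converting before being
used as a UTF-16 cursor column. A rough sketch of such a conversion, with
utf8ByteOffsetToUtf16Units() being a made-up name and not an existing
KDevelop or clang function:

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: number of UTF-16 code units covered by the first
// `byteOffset` bytes of a well-formed UTF-8 line. Assumes byteOffset falls
// on a code point boundary.
std::size_t utf8ByteOffsetToUtf16Units(std::string_view utf8,
                                       std::size_t byteOffset)
{
    std::size_t units = 0;
    for (std::size_t i = 0; i < utf8.size() && i < byteOffset; ++i) {
        const unsigned char b = static_cast<unsigned char>(utf8[i]);
        if ((b & 0xC0) == 0x80)
            continue; // continuation byte, already counted via its lead byte
        // A lead byte >= 0xF0 starts a 4-byte sequence, i.e. a code point
        // outside the BMP, which takes two UTF-16 code units.
        units += (b >= 0xF0) ? 2 : 1;
    }
    return units;
}
```

For the line from the dump, the byte offset just past the emoji is 5 (1 byte
for "l" plus 4 for the emoji), while the corresponding UTF-16 column is 3.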
