https://bugs.kde.org/show_bug.cgi?id=518035
--- Comment #3 from JATothrim <[email protected]> ---
(In reply to Sven Brauch from comment #2)
> Either way, the linked TODO is unlikely to be the problem on Linux, since
> the "local encoding" is typically UTF-8. For sure the problem exists even
> for such files.

You are right, this is not the cause of the bug:
QString::fromLocal8Bit(file.readAll()).toUtf8() has no effect on it. My intent
was just to forward the links from the last post on discuss and to confirm the
bug. I have not analyzed the bug in detail; I am just reading the code at the
moment. The TODO in parsejob.cpp is, however, at least a place to start reading
the code.

> Kate's cursor position is counted in UTF16 surrogates (which is somewhat
> curious for 2026 tech).

000001e0  6c f0 9f 98 84 22 20 3c  3c 20 6c 6f 6f 6f 6f 6f  |l...." << looooo|
000001e0  6c 61 ef bf bd 22 20 3c  3c 20 6c 6f 6f 6f 6f 6f  |la..." << looooo|

In overwrite mode, typing over a multi-byte code point with e.g. "a" breaks it:
the bytes f0 9f 98 84 become 61 ef bf bd, i.e. "a" followed by U+FFFD. (This
also happens in Kate.)

> Maybe some languages / situations report ranges in bytes, not unicode code
> points?

I did a quick test in a CMakeLists.txt file and did not see the bug happen
there immediately. Searching for clang_getFileLocation() gives a better
starting point for investigating this bug. Annoyingly, clang's CXSourceLocation
does not say in which unit (bytes, code points?) it reports the location.
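
The second hexdump line is consistent with a single UTF-16 code unit being overwritten, although I have not confirmed that this is what Kate actually does. Below is a minimal Python sketch of that assumption, not Kate or KDevelop code. The sample string and the helper names (utf16_units, from_utf16_units, overwrite_unit) are made up for illustration. It also prints how the same position differs depending on whether it is counted in bytes, code points, or UTF-16 code units.

```python
# Assumed sample text: "l", then U+1F604 (f0 9f 98 84 in UTF-8), then '"'.
before = 'l\U0001F604"'


def utf16_units(s):
    """Return the UTF-16 code units of s as a list of integers."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def from_utf16_units(units):
    """Decode UTF-16 code units; unpaired surrogates become U+FFFD."""
    data = b"".join(u.to_bytes(2, "little") for u in units)
    return data.decode("utf-16-le", errors="replace")


def overwrite_unit(s, index, ch):
    """Model of overwrite mode that replaces exactly one UTF-16 code unit."""
    units = utf16_units(s)
    units[index:index + 1] = utf16_units(ch)
    return from_utf16_units(units)


# Typing "a" over the first half of the surrogate pair leaves the second
# half unpaired, so it can no longer be encoded as valid UTF-8.
after = overwrite_unit(before, 1, "a")
print(before.encode("utf-8").hex(" "))
print(after.encode("utf-8").hex(" "))

# The column just after the emoji, counted in three different units.
prefix = before[:2]
print("bytes:", len(prefix.encode("utf-8")))
print("code points:", len(prefix))
print("UTF-16 code units:", len(utf16_units(prefix)))
```

If a location reported in one of these units is interpreted as another, every range after the first non-ASCII character on that line would be shifted, which is the kind of mismatch worth checking for around clang_getFileLocation().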
