[Bug 171165] Byte Position Functions count characters with unicode point greater than U+FFFF as 4 bytes

bugzilla-daemon Fri, 06 Mar 2026 11:06:02 -0800

https://bugs.documentfoundation.org/show_bug.cgi?id=171165


--- Comment #6 from Mike Kaganski <[email protected]> ---
Working on https://gerrit.libreoffice.org/c/core/+/201151, I discovered the
reason for this *bug*: the implementation doesn't process characters (Unicode
code points), but UTF-16 code units; therefore, for an SMP character (like an
emoji), it processes both surrogates separately; and since surrogates (both
high and low) are considered DBCS, each one counts as 2 bytes, which gives 4.
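
To illustrate the difference between iterating UTF-16 code units and
iterating code points, here is a minimal sketch in Python. It is not the
LibreOffice code: toy_is_dbcs() is a hypothetical stand-in for the real DBCS
check, and counting an SMP character as 2 bytes in the per-code-point variant
is an assumption made only for the comparison.

```python
def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def toy_is_dbcs(value):
    # Hypothetical classifier: anything outside ASCII is "double-byte".
    # Surrogates (U+D800..U+DFFF) therefore also count as double-byte,
    # which mirrors the behaviour described in the comment.
    return value > 0x7F


def byte_len_per_code_unit(text):
    # Walks UTF-16 code units, so a surrogate pair is counted twice.
    return sum(2 if toy_is_dbcs(u) else 1 for u in utf16_code_units(text))


def byte_len_per_code_point(text):
    # Python strings iterate by code point, so a surrogate pair is one item.
    return sum(2 if toy_is_dbcs(ord(ch)) else 1 for ch in text)


emoji = "\U0001F600"  # an SMP character, encoded as D83D DE00 in UTF-16
print([hex(u) for u in utf16_code_units(emoji)])
print(byte_len_per_code_unit(emoji))   # both surrogates counted
print(byte_len_per_code_point(emoji))  # one code point counted
```

With this toy classifier, the per-code-unit walk yields 4 for the emoji,
while the per-code-point walk yields 2.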

-- 
You are receiving this mail because:
You are the assignee for the bug.
