https://bugs.documentfoundation.org/show_bug.cgi?id=57307

--- Comment #10 from himajin100...@gmail.com ---
1. When the selection length is 0, the Basic IDE tries to work out the name of
the identifier at the cursor.

https://opengrok.libreoffice.org/xref/core/basctl/source/basicide/baside2.cxx?r=4a6dc219#686

2. To do so, it relies on BreakIterator for word breaking.

https://opengrok.libreoffice.org/xref/core/editeng/source/editeng/editeng.cxx?r=8feca893#645
https://opengrok.libreoffice.org/xref/core/editeng/source/editeng/editeng.cxx?r=8feca893#841
https://opengrok.libreoffice.org/xref/core/editeng/source/editeng/impedit2.cxx?r=8feca893#1534

3. BreakIterator uses the "edit_word" rule data (I'm not so sure about this).

https://opengrok.libreoffice.org/xref/core/i18npool/source/breakiterator/breakiterator_unicode.cxx?r=da95fc29#94

4. The underscore, whose official name is U+005F LOW LINE and which is in the
"Punctuation, Connector [Pc]" general category, is classified as ExtendNumLet
in UAX #29.

https://unicode.org/reports/tr29/#ExtendNumLetWB

5. WB13a and WB13b say that there is no break between this character and an
adjacent letter, neither before it nor after it.

https://unicode.org/reports/tr29/#WB13a
https://unicode.org/reports/tr29/#WB13b

6. While icu4c's official rule data has a section for WB13a/WB13b,

https://github.com/unicode-org/icu/blob/master/icu4c/source/data/brkitr/rules/word.txt#L159

LibreOffice does not have such a section.

https://opengrok.libreoffice.org/xref/core/i18npool/source/breakiterator/data/edit_word.txt?r=f8f05d43#105

Can this be the cause?
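
To make points 4 to 6 concrete, here is a minimal sketch in Python of what I
think happens. It is not LibreOffice or ICU code. The function names
(is_letter_or_digit, is_extend_num_let, split_words, word_at) are hypothetical,
ExtendNumLet is approximated as "general category Pc", and letters and digits
are approximated with str.isalnum(). The connect_underscore flag stands in for
the presence or absence of the WB13a/WB13b section in the rule data.

```python
import unicodedata


def is_letter_or_digit(ch):
    # Rough stand-in for the UAX #29 letter and Numeric classes (assumption).
    return ch.isalnum()


def is_extend_num_let(ch):
    # Approximation: ExtendNumLet is taken to be General_Category = Pc.
    return unicodedata.category(ch) == "Pc"


def split_words(text, connect_underscore):
    """Split text into pieces, never breaking between two word characters.

    With connect_underscore=True, ExtendNumLet characters count as word
    characters, which models the effect of WB13a/WB13b.
    """
    def is_word_char(ch):
        return is_letter_or_digit(ch) or (
            connect_underscore and is_extend_num_let(ch)
        )

    pieces = []
    for ch in text:
        if pieces and is_word_char(ch) and is_word_char(pieces[-1][-1]):
            pieces[-1] += ch  # no break here: extend the current piece
        else:
            pieces.append(ch)  # break before this character
    return pieces


def word_at(text, index, connect_underscore):
    # Return the piece that contains the character at position `index`.
    pos = 0
    for piece in split_words(text, connect_underscore):
        if pos <= index < pos + len(piece):
            return piece
        pos += len(piece)
    return ""


print(unicodedata.name("_"), unicodedata.category("_"))  # LOW LINE Pc
print(word_at("my_var = 1", 4, True))   # my_var
print(word_at("my_var = 1", 4, False))  # var
```

With the flag off, a cursor inside "my_var" yields only "var", which looks
like the behaviour described in this bug. Whether the real edit_word rules
behave this way is exactly the open question above.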

ADDITIONAL REFERENCE:

ICU's break iterator does not consider underscores as word-breaking punctuation 
https://bugs.chromium.org/p/chromium/issues/detail?id=364301
