Dominique Pellé <[email protected]> wrote:

> Hi
>
> I found that :help byteidx() is rather confusing
> regarding how it treats combining characters.
> It says...
>
> "Composing characters are counted as a separate character."
>
> I initially interpreted that as combining characters
> are not combined, so counted as separate characters.
> But that is not what byteidx() does with combining chars.
> It actually treats them as a single character. So the
> help is misleading or badly worded. I only understood
> after experimenting with the attached script which
> shows what byteidx() does with/without combining
> char:
>
> ===
> $ cat byteidx-with-combining-char.vim
> " This script illustrates the behavior of byteidx()
> " with/without combining chars.
>
> " Example of string without using composing chars
> " Code points:    U+002E  U+00E9       U+002E
> " utf8 sequences: (0x2e)  (0xc3 0xa9)  (0x2e)
> let s:a = '.é.'
>
> " Same string but with composing char for the e-acute.
> " Code points:    U+002E  U+0065 + U+0301      U+002E
> " utf8 sequences: (0x2e)  (0x65 + 0xcc 0x81)   (0x2e)
> let s:b = '.é.'
>
> echo 'Testing without combining char'
> echo [byteidx(s:a, 0), byteidx(s:a, 1), byteidx(s:a, 2), byteidx(s:a, 3), byteidx(s:a, 4)]
>
> echo 'Testing with combining char'
> echo [byteidx(s:b, 0), byteidx(s:b, 1), byteidx(s:b, 2), byteidx(s:b, 3), byteidx(s:b, 4)]
> ===
>
> :so byteidx-with-combining-char.vim
> Testing without combining char
> [0, 1, 3, 4, -1]
> Testing with combining char
> [0, 1, 4, 5, -1]
>
> Help is also ambiguous about what is meant by
> the "length of string" returned (it's actually a length
> in bytes).
>
> Attached patch makes it clearer I hope.
I should add that I looked at byteidx() with combining characters
because I was trying to fix the issue in the LanguageTool plugin
for Vim shown in the screenshot at:

https://github.com/languagetool-org/languagetool/issues/23#issuecomment-26176511

The wrong part of the second word in the screenshot is highlighted
because LanguageTool counts columns in characters, with each
combining character counted separately (the same as Java's
String.length()), whereas Vim's byteidx() counts a base character
together with its combining characters as a single character.

So far I have not found a way to fix the highlighting in the
LanguageTool plugin.

Dominique
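
For illustration, the two counting conventions can be reproduced outside Vim. The following is only a minimal Python sketch, not code from Vim or LanguageTool: both function names are hypothetical, "combining" is approximated with unicodedata.combining(), and counting code points is assumed to match Java's String.length() only for characters in the Basic Multilingual Plane.

```python
import unicodedata


def byte_index_per_code_point(s, nr):
    """UTF-8 byte offset of the nr-th code point.

    Every code point counts, including combining characters
    (the column convention described for LanguageTool).
    """
    if nr < 0 or nr > len(s):
        return -1
    return len(s[:nr].encode("utf-8"))


def byte_index_per_combined_char(s, nr):
    """UTF-8 byte offset of the nr-th character when combining
    characters are attached to the preceding base character
    (the behavior the script output shows for byteidx()).
    """
    # Code-point positions where a new base character starts.
    starts = [i for i, ch in enumerate(s)
              if i == 0 or unicodedata.combining(ch) == 0]
    if nr < 0 or nr > len(starts):
        return -1
    end = starts[nr] if nr < len(starts) else len(s)
    return len(s[:end].encode("utf-8"))


a = ".\u00e9."    # precomposed e-acute
b = ".e\u0301."   # 'e' followed by U+0301 COMBINING ACUTE ACCENT

print([byte_index_per_combined_char(a, n) for n in range(5)])
print([byte_index_per_combined_char(b, n) for n in range(5)])
print([byte_index_per_code_point(b, n) for n in range(6)])
```

With the decomposed string, index 2 maps to byte 4 under the combined convention but to byte 2 when each code point is counted, which is the kind of offset mismatch that would shift the highlighting.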
