Charles E Campbell Jr wrote:
> Hello!
>
> I've received a couple of requests about getting Align.vim to work with
> utf-8 characters.  As an example, consider:
>
>   let x='grün'
>   echo "strlen(x)=".strlen(x)
>
> Thus, strlen() returns 5, not 4 as one might (sometimes) expect.  So, I
> tried a workaround:
>
>   fun! Strlen(x)
>     1split
>     enew
>     call setline(1,a:x)
>     let ret= virtcol("$") - 1
>     bwipe!
>     return ret
>   endfun
>
>   echo Strlen(x)
>
> now returns 4 (at the price of using interpreted code over built-in
> strlen()).  So, is this the best that can be done?
> I'd prefer to have a built-in compiled function for this.
>
> Regards,
> Chip Campbell

It all depends on what exactly you want to do. (I haven't read the Align.vim
docs.) The length of a UTF-8 string can be counted in several nonequivalent
ways:

- number of bytes (Latin a + combining circumflex is three bytes):
      strlen(string)
- number of codepoints (Latin a + combining circumflex is two codepoints):
      strlen(substitute(string, '.', 'x', 'g'))
- number of spacing codepoints (Latin a + combining circumflex is one spacing
  codepoint; a hard tab is one; wide and narrow CJK are one each; etc.):
      (untested) strlen(substitute(string, '.\Z', 'x', 'g'))
- virtual length (counting, for instance, tabs as anything between 1 and
  'tabstop', wide CJK as 2 rather than 1, Arabic alif as zero when immediately
  preceded by lam, one otherwise, etc.):
      I guess something like what you're doing above will be necessary
      because of the wide range of things that can happen.

The first two above are documented at ":help strlen()", the third (in
addition) at ":help patterns-composing".

Best regards,
Tony.

You received this message from the "vim_dev" maillist.
For more information, visit http://www.vim.org/maillist.php
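
For comparison, here is a minimal sketch of the same counts using built-in functions instead of a scratch buffer. It assumes 'encoding' is utf-8 and a Vim recent enough to provide strchars() (including its optional second argument), strwidth() and strdisplaywidth(); older releases do not have them. The name CountAll is hypothetical and used only for illustration.

```vim
" Sketch only: assumes 'encoding' is utf-8 and a Vim that provides
" strchars() with the optional {skipcc} argument, and strwidth().
" CountAll is a hypothetical helper name, not part of Vim or Align.vim.
function! CountAll(str)
  let counts = {}
  " Number of bytes.
  let counts.bytes = strlen(a:str)
  " Number of codepoints, composing characters counted separately.
  let counts.codepoints = strchars(a:str)
  " Number of characters, composing characters not counted.
  let counts.chars = strchars(a:str, 1)
  " Number of display cells; a Tab is counted as one cell here.
  let counts.cells = strwidth(a:str)
  return counts
endfunction

" Latin a + combining circumflex (U+0302).
echo CountAll("a\u0302")
" The string from the original question.
echo CountAll("gr\u00fcn")
```

strwidth() counts a hard tab as a single cell. Where tabs should expand according to 'tabstop' and the starting column, strdisplaywidth() is probably the closer match to the virtcol("$") - 1 result from the scratch-buffer workaround.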