Charles E Campbell Jr wrote:
> Tony Mechelynck wrote:
>
>> It all depends on what exactly you want to do. (I haven't read the Align.vim
>> docs.) The length of a UTF-8 string can be counted in several nonequivalent
>> ways:
>>
>> - number of bytes (Latin a + combining circumflex is three bytes):
>>     strlen(string)
>>
>> - number of codepoints (Latin a + combining circumflex is two codepoints):
>>     strlen(substitute(string, '.', 'x', 'g'))
>>
>> - number of spacing codepoints (Latin a + combining circumflex is one
>>   spacing codepoint; a hard tab is one; wide and narrow CJK are one each;
>>   etc.): (untested)
>>     strlen(substitute(string, '.\Z', 'x', 'g'))
>>
>> - virtual length (counting, for instance, tabs as anything between 1 and
>>   'tabstop', wide CJK as 2 rather than 1, Arabic alif as zero when
>>   immediately preceded by lam, one otherwise, etc.): I guess something like
>>   what you're doing above will be necessary because of the wide range of
>>   things that can happen.
>>
>> The first two above are documented at ":help strlen()", the third (in
>> addition) at ":help patterns-composing".
>>
>
> Thank you, Tony, for that explanation! I've modified Align so that the
> method used is selectable by the user. Align v33d available at my
> website (http://mysite.verizon.net/astronaut/vim/index.html#ALIGN) with
> these changes.
>
> Regards,
> Chip Campbell
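
For reference, on a newer Vim the four counts quoted above can be had more directly. What follows is only a minimal sketch, assuming 'encoding' is utf-8 and a Vim that has strchars() and strdisplaywidth() (7.3 or later, as far as I know; the second argument of strchars() came with a later 7.4 patch). StringLengths() is a made-up name for illustration, not something from Align.vim.

```vim
" Sketch only: assumes 'encoding' is utf-8 and a Vim new enough to have
" strchars() (with its optional second argument) and strdisplaywidth().
" StringLengths() is a hypothetical helper name.
function! StringLengths(str) abort
  let l:counts = {}
  " 1. Bytes: strlen() always counts bytes.
  let l:counts.bytes = strlen(a:str)
  " 2. Codepoints: composing characters are counted separately.
  let l:counts.codepoints = strchars(a:str)
  " 3. Spacing codepoints: a non-zero second argument skips composing
  "    characters.
  let l:counts.spacing = strchars(a:str, 1)
  " 4. Virtual length: screen cells, starting at column 0 of the current
  "    window (depends on 'tabstop', 'ambiwidth', etc.).
  let l:counts.virtual = strdisplaywidth(a:str)
  return l:counts
endfunction
```

With Latin a + combining circumflex, i.e. :echo StringLengths("a\u0302"), I would expect bytes 3, codepoints 2, spacing 1 and virtual 1, in line with the figures quoted above. The virtual figure depends on 'tabstop', 'ambiwidth' and the starting column, so a hard tab or an ambiguous-width character can give different answers in different windows. One caveat on the older idiom: as far as I can tell, ":help strlen()" in Vim 7 describes substitute(string, '.', 'x', 'g') as not counting composing characters, so it may give the third count rather than the second; worth testing before relying on it.
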
... and, in addition, when 'fileencoding' is nonempty and different from
'encoding', the number of bytes used on disk might be useful, but I don't know
how Vim could get it. This is especially true for encodings such as those used
in East Asia, where the number of bytes per character may vary in a way which
is often not easily predictable from the UTF-8 representation. (The 2-or-4
bytes per character of UTF-16 are peanuts next to that, but Vim cannot use
UTF-16 for its internal representation of the data because of the intervening
nulls.)

Best regards,
Tony.
--
Weiler's Law:
	Nothing is impossible for the man who doesn't have to do it himself.
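
One possible way to approximate the on-disk byte count mentioned above is to convert the string with iconv() and take strlen() of the result. This is a sketch under assumptions, not a tested solution: it assumes Vim was compiled with +iconv (without it only UTF-8 <-> latin1 is converted), and DiskBytes() is a made-up name.

```vim
" Sketch only: estimate how many bytes a string would occupy in the
" encoding {fenc}. DiskBytes() is a hypothetical helper name.
" Returns -1 when the conversion fails completely.
function! DiskBytes(str, fenc) abort
  if a:fenc ==# '' || a:fenc ==? &encoding
    " No conversion on write: disk bytes equal in-memory bytes.
    return strlen(a:str)
  endif
  let l:converted = iconv(a:str, &encoding, a:fenc)
  if l:converted ==# '' && a:str !=# ''
    " iconv() returns an empty string when it cannot convert at all.
    return -1
  endif
  " The converted text is just a byte string to Vim, so strlen() gives
  " its size in the target encoding.
  return strlen(l:converted)
endfunction
```

For a buffer line this might be called as DiskBytes(getline('.'), &fileencoding). The caveats are real, though: characters that cannot be converted come back as "?", so the count can be silently off; line endings and a possible BOM are not included; and I would not expect it to work for UTF-16 or other encodings with embedded nulls, since the converted text has to live in a Vim string.
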