Charles E Campbell Jr wrote:
> Tony Mechelynck wrote:
>
>> It all depends on what exactly you want to do. (I haven't read the Align.vim
>> docs.) The length of a UTF-8 string can be counted in several nonequivalent
>> ways:
>>
>> - number of bytes (Latin a + combining circumflex is three bytes):
>> strlen(string)
>>
>> - number of codepoints (Latin a + combining circumflex is two codepoints):
>> strlen(substitute(string, '.', 'x', 'g'))
>>
>> - number of spacing codepoints (Latin a + combining circumflex is one spacing
>> codepoint; a hard tab is one; wide and narrow CJK are one each; etc.):
>> (untested)
>> strlen(substitute(string, '.\Z', 'x', 'g'))
>>
>> - virtual length (counting, for instance, tabs as anything between 1 and
>> 'tabstop', wide CJK as 2 rather than 1, Arabic alif as zero when immediately
>> preceded by lam, one otherwise, etc.): I guess something like what you're
>> doing above will be necessary because of the wide range of things that can
>> happen.
>>
>> The first two above are documented at ":help strlen()", the third (in
>> addition) at ":help patterns-composing".
>>
>>
> Thank you, Tony, for that explanation! I've modified Align so that the
> method used is selectable by the user. Align v33d available at my
> website (http://mysite.verizon.net/astronaut/vim/index.html#ALIGN) with
> these changes.
>
> Regards,
> Chip Campbell

... and, in addition, when 'fileencoding' is nonempty and differs from
'encoding', the number of bytes the text occupies on disk might be useful, but
I don't know how Vim could get it, especially for the encodings used in East
Asia, where the number of bytes per character varies in ways that are often
not easily predictable from the UTF-8 representation. (The 2-or-4 bytes per
character of UTF-16 are peanuts next to that, but Vim cannot use UTF-16 for its
internal representation of the data because of the embedded nulls.)
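
For anyone who wants to check these different "lengths" outside Vim, here is a rough Python sketch. It is not how Vim or Align.vim computes anything, and all four function names are made up for illustration. It assumes that "spacing codepoint" means anything outside the Unicode categories Mn and Me, that wide and fullwidth East Asian characters take two columns, and that tabs advance to the next multiple of a hypothetical `tabstop` parameter. It ignores context-dependent cases such as Arabic lam-alif. The on-disk size is obtained simply by re-encoding the text in the target encoding.

```python
import unicodedata


def byte_length(text, encoding="utf-8"):
    """Number of bytes the text occupies when stored in `encoding`."""
    return len(text.encode(encoding))


def codepoint_count(text):
    """Number of Unicode codepoints (Python strings are codepoint sequences)."""
    return len(text)


def spacing_codepoint_count(text):
    """Codepoints that are not nonspacing (Mn) or enclosing (Me) marks."""
    return sum(1 for ch in text if unicodedata.category(ch) not in ("Mn", "Me"))


def display_width(text, tabstop=8):
    """Approximate screen columns used, assuming the text starts at column 0."""
    col = 0
    for ch in text:
        if ch == "\t":
            # A tab advances to the next multiple of tabstop.
            col += tabstop - (col % tabstop)
        elif unicodedata.category(ch) in ("Mn", "Me"):
            # Combining marks are drawn on top of the previous character.
            continue
        elif unicodedata.east_asian_width(ch) in ("W", "F"):
            # Wide and fullwidth characters are assumed to take two columns.
            col += 2
        else:
            col += 1
    return col


if __name__ == "__main__":
    a_circumflex = "a\u0302"  # Latin a + combining circumflex
    print(byte_length(a_circumflex),
          codepoint_count(a_circumflex),
          spacing_codepoint_count(a_circumflex),
          display_width(a_circumflex))

    kanji = "\u6f22\u5b57"  # two wide CJK characters
    for enc in ("utf-8", "euc_jp", "shift_jis", "utf-16-le"):
        print(enc, byte_length(kanji, enc))
```

Re-encoding is the brute-force route: it costs a full conversion of the text, but it sidesteps any attempt to predict the byte count of a variable-width East Asian encoding from the UTF-8 bytes.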
Best regards,
Tony.
--
Weiler's Law:
Nothing is impossible for the man who doesn't have to do it
himself.

You received this message from the "vim_dev" maillist.
For more information, visit http://www.vim.org/maillist.php