On Tue, Sep 05, 2023 at 09:09:18PM +0300, Eli Zaretskii wrote:
> > Date: Tue, 5 Sep 2023 20:01:53 +0200
> > From: Patrice Dumas <[email protected]>
> >
> > Currently, when counting the width of a line of character, we count
> > control characters that are also spaces as having a width of 1.  I think
> > that it is not good, as control characters either should not have a
> > width, for end of line, form feed, carriage return, or have a width that
> > is not well defined for vertical and horizontal tab.  I suggest to
> > consider all the control characters as having a width of 0.  This will
> > be consistent with libunistring u8_strwidth, which I intend to use in C
> > code equivalent to perl code.
>
> Please define "control characters" for this purpose.  Some of them are
> definitely not zero-width, for example, TAB.

Characters whose Unicode code points, in decimal, are in the range 0 to
31, plus 127 (Delete).  This includes the horizontal tab.  This set
corresponds to the [:cntrl:] character class.

> Also, depending on how control characters are displayed, their width
> could be even 4, for example if they are displayed as \nnn octal
> escapes.

This is in a context where they are displayed as encoded bytes.

-- 
Pat
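
To make the definition above concrete, here is a minimal sketch in
Python of a width count that treats code points 0 to 31 and 127 as
having width 0.  It is not the Texinfo code and not the algorithm used
by u8_strwidth; the names is_control and line_width are hypothetical,
and the handling of combining marks and East Asian wide characters is a
simplifying assumption added only so the example is self-contained.

```python
import unicodedata


def is_control(ch: str) -> bool:
    """True for code points 0-31 and 127, the set described in the message."""
    cp = ord(ch)
    return cp <= 31 or cp == 127


def line_width(text: str) -> int:
    """Approximate display width, counting control characters as width 0.

    Hypothetical helper: the wide/combining rules below are assumptions,
    not a reimplementation of libunistring.
    """
    width = 0
    for ch in text:
        if is_control(ch):
            continue  # zero width, including TAB, as proposed
        if unicodedata.combining(ch):
            continue  # combining marks do not advance the column
        # East Asian Wide and Fullwidth characters take two columns.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width
```

With this sketch, line_width("a\tb\n") counts only the two letters, so
a tab or an end of line no longer contributes to the width of the line.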
