On 2021/04/06 13:09, Martijn van Duren wrote:
> I'm also not convinced that the other wcwidth implementations might be
> on to something and that the unicode consortium is having inertia
> problems.

The difficulty is that it isn't *possible* to give a single correct
answer for the width of SHY. It varies, and can only be determined
when other information about the terminal is taken into account (how
the terminal behaves and whether the word currently being printed is
wrapped), which is out of scope for wcwidth(3). So it is no surprise
that different people come up with different ways to handle it.
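
To make the context dependence concrete, here is a rough Python sketch.
The render() helper and its break_at parameter are hypothetical, made up
for illustration, and it models only one possible behaviour (SHY
invisible unless the word is broken there). Some terminals always draw
SHY as a hyphen instead.

```python
from typing import List, Optional

SHY = "\u00ad"  # U+00AD SOFT HYPHEN


def render(word: str, break_at: Optional[int] = None) -> List[str]:
    """Hypothetical renderer, for illustration only.

    break_at is the index of the SHY at which the word is wrapped,
    or None if the word fits on the line.
    """
    parts = word.split(SHY)
    if break_at is None:
        # Not wrapped: every SHY is invisible and takes 0 columns.
        return ["".join(parts)]
    # Wrapped: the SHY at the break shows as a hyphen (1 column),
    # any other SHY in the word stays invisible.
    head = "".join(parts[: break_at + 1]) + "-"
    tail = "".join(parts[break_at + 1:])
    return [head, tail]


print(render("hy" + SHY + "phen"))     # ['hyphen']
print(render("hy" + SHY + "phen", 0))  # ['hy-', 'phen']
```

The same character takes 0 columns in the first call and 1 in the
second, and wcwidth(3) only ever sees the character, not the wrapping
decision.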

> If you want to show a hyphen in your text, use a hyphen. If you want to
> indicate where a word might be broken up in a hyphenated way across two
> lines if the software knows the localized grammar rules use a SHY.
> Also thanks to sthen@ for pointing out where the confusion comes from:
> we're using UTF-8 here, not ISO-8859-1, so we must make sure that we
> use the UTF-8 definitions.

But guess what happens when text is converted from ISO-8859-1 to UTF-8...

$ printf '\xad' | iconv -f iso-8859-1 -t utf-8 | hexdump -C
00000000  c2 ad                                             |..|
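
The same check can be sketched in Python, using only the standard
library, to show which code point those two bytes encode. The variable
names are just for illustration.

```python
import unicodedata

latin1 = b"\xad"                     # SHY as a single ISO-8859-1 byte
text = latin1.decode("iso-8859-1")   # one code point
utf8 = text.encode("utf-8")          # re-encoded as UTF-8

print(utf8.hex(" "))                 # c2 ad
print(f"U+{ord(text):04X}", unicodedata.name(text),
      unicodedata.category(text))    # U+00AD SOFT HYPHEN Cf
```

So the ISO-8859-1 byte 0xAD lands on U+00AD, the very same SOFT HYPHEN
that a UTF-8 aware wcwidth(3) is asked about.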
