On 27/11/2025 21:32, Collin Funk wrote:
Hi Pádraig,

Your coreutils i18n page mentions that 'tac' needs work to handle
multibyte characters [1]. Could you help me understand why it is listed
there?

My initial guess was that re_search does not work on multibyte
characters since it compares bytes. However, that still works for UTF-8:

    $ printf '1д2д3д' | tac --separator='д' && printf '\n'
    3д2д1д
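
A rough illustration of why the byte-wise match above holds for UTF-8
(a sketch only, not tac's actual code; `tac_bytes` is a hypothetical
helper that assumes a literal, non-regex separator). In UTF-8, lead
bytes and continuation bytes occupy disjoint ranges, so the bytes of
one valid character can never match starting in the middle of another:

```python
def tac_bytes(data: bytes, sep: bytes) -> bytes:
    """Reverse records that each end with sep, comparing raw bytes only."""
    if not sep:
        raise ValueError("separator must not be empty")
    records = []
    start = 0
    while True:
        idx = data.find(sep, start)   # plain byte search, no decoding
        if idx < 0:
            break
        end = idx + len(sep)
        records.append(data[start:end])
        start = end
    if start < len(data):
        records.append(data[start:])  # trailing record without a separator
    return b"".join(reversed(records))


text = "1д2д3д"
out = tac_bytes(text.encode("utf-8"), "д".encode("utf-8"))
print(out.decode("utf-8"))  # 3д2д1д
```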

I guess if we want it to work with other character sets, we can use the
fastmap only in unibyte or UTF-8 locales. Does that sound correct? I am
not too familiar with the GNU regex functions.
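
To make the concern about other character sets concrete, here is a
small sketch (an illustration using Python's codecs, not anything from
tac or GNU regex). Shift_JIS is assumed as the example encoding: its
trail bytes can overlap ASCII, so a byte-wise search can report a
separator in the middle of a character:

```python
# Shift_JIS encodes KATAKANA LETTER SO as 0x83 0x5C; its second byte
# is the same value as an ASCII backslash.
data = b"\x83\x5c\x5c"               # "ソ" followed by one real backslash
sep = b"\x5c"                        # backslash as the separator

byte_hits = data.count(sep)                        # byte-wise view
char_hits = data.decode("shift_jis").count("\\")   # character-wise view

print(byte_hits, char_hits)  # 2 1
```

The byte-wise count includes a false match inside the two-byte
character, which is the kind of result a bytes-only search would give
in such a locale.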

Collin

[1] https://www.pixelbeat.org/docs/coreutils_i18n/

Oh, I was mistaken.

I think I saw the --separator option and assumed it would
have similar issues to the join -t option.

I've removed it from the page.

thanks,
Padraig
