Hi Michael,

* Michael G Schwern <schw...@pobox.com> [2007-11-19 02:25]:
> nroff (or rather groff) replaced all the ASCII single quotes in
> the file with fancy Unicode x2019 quotes.

It doesn't stop there. Guess what it does with double hyphens? No,
really! It converts them into em-dashes. Aaaaaaaaaaah!

    $ grep man ~/.bashrc
    alias man='LC_CTYPE=C man'

Reminds me, this is not the only GNU tool that needs such
treatment. GNU grep pays attention to the locale as well, but its
encoding decoder is apparently written in Visual Basic -- if you
use a UTF-8 locale, it will slow down by TWO ORDERS OF MAGNITUDE.

    $ time LC_CTYPE=en_US.utf8 grep -cq tes /usr/share/dict/words 

    real    0m0.686s
    user    0m0.680s
    sys     0m0.004s

    $ time LC_CTYPE=C grep -cq tes /usr/share/dict/words 

    real    0m0.006s
    user    0m0.004s
    sys     0m0.000s
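
If you'd rather not prefix every command by hand, the same trick can be wrapped in a small shell function. This is only a sketch: the name `in_c_locale` is made up for illustration, and it sets LC_ALL rather than LC_CTYPE, on the assumption that you don't mind messages and collation also falling back to C for that one command. (LC_ALL takes precedence over LC_CTYPE, so a bare LC_CTYPE=C has no effect if LC_ALL is already set in your environment.)

```shell
# Hypothetical helper (not part of any tool): run one command with
# every locale category forced to C. LC_ALL overrides LC_CTYPE, so
# this works even when LC_ALL is already set in the environment.
in_c_locale() {
    env LC_ALL=C "$@"
}

# Example: count matching lines with byte-wise matching,
# no multibyte decoding involved.
printf 'test\nbest\ntestes\n' | in_c_locale grep -c tes
```

The locale change applies only to the wrapped command, so the rest of the session keeps its UTF-8 settings.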

Regards,
-- 
Aristotle Pagaltzis // <http://plasmasturm.org/>
