Markus Kuhn wrote:
With the way Perl supports Unicode, you should normally hardly ever
have to call a UTF-8 encoder or decoder explicitly and manually. You
just have to make sure that when a UTF-8 string enters Perl, it does so
tagged as a UTF-8 string and not as an octet string. How that happ
Hey guys, I have a question relating to Perl's Unicode support.
I have a Perl script that does a lot of processing of filenames that
have Unicode characters in them, and it's been behaving really wonky. I've
relied on a function called 'decode_utf8' from a module called "Encode"
to help sort things
Markus Kuhn wrote:
Your results show that grep in UTF-8 mode is also 100x slower than in
single-byte mode, just like on my system (300 MHz P3). You have just
used a faster CPU.
D'oh :)
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
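
For anyone wanting to repeat the UTF-8 versus single-byte comparison discussed above, the following is a rough sketch only. The file contents, the line count, and the helper names (make_test_file, search_status) are assumptions, not the test.txt used in the thread. It also assumes the en_GB.UTF-8 locale is installed. If it is not, grep silently falls back to the C locale and both runs will look the same.

```shell
#!/bin/sh
# Hypothetical helper: write $2 lines of filler text to the file $1.
# None of the lines contain the search pattern.
make_test_file() {
    i=0
    while [ "$i" -lt "$2" ]; do
        echo "the quick brown fox jumps over the lazy dog $i"
        i=$((i + 1))
    done > "$1"
}

# Hypothetical helper: search file $2 under locale $1 and print grep's
# exit status (1 means "no match", as in the transcript).
search_status() {
    LC_ALL=$1 grep XYZ "$2" > /dev/null
    echo $?
}

testfile=$(mktemp)
make_test_file "$testfile" 20000

for loc in en_GB.UTF-8 POSIX; do
    start=$(date +%s)
    status=$(search_status "$loc" "$testfile")
    end=$(date +%s)
    # Whole-second resolution only; use a much larger file for real numbers.
    echo "LC_ALL=$loc: exit status $status, about $((end - start))s"
done

rm -f "$testfile"
```

The timings depend heavily on the grep version and the hardware, so a newer grep may show a much smaller gap than the 2.5.1 figures reported in this thread.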
Markus Kuhn wrote:
On Red Hat 9:
$ grep --version
grep (GNU grep) 2.5.1
$ LC_ALL=en_GB.UTF-8 time grep XYZ test.txt
Command exited with non-zero status 1
6.83user 0.07system 0:06.93elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (157major+34minor)pagefaults 0swaps
$ LC_ALL=POSIX