On Tue, Mar 01, 2005 at 09:27:18PM +0100, Andreas Schwab wrote:
> It cannot do the former. There are encodings where bytes that look like
> plain ASCII are in fact part of a multibyte sequence. But it may be
> worthwhile to special case UTF-8 which does not have this problem.

I have patches for grep that make it a lot faster with UTF-8, and I have been trying to get them upstream for ages. Also, I believe the glibc regex implementation already performs the above optimisation (more or less) anyway.

Tim.
*/
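
To illustrate the property discussed in the quoted text, here is a minimal sketch in Python (not taken from grep, coreutils, or glibc). The helper name `ascii_bytes_inside_multibyte` and the choice of Shift_JIS as the contrasting encoding are assumptions made for this example. It checks whether any byte of a multibyte character falls in the ASCII range, which is what makes byte-wise matching unsafe in some encodings but safe in UTF-8.

```python
def ascii_bytes_inside_multibyte(text: str, encoding: str) -> list[int]:
    """Return ASCII-range byte values found inside multibyte characters.

    Hypothetical helper for illustration only: each character is encoded
    separately, and any byte below 0x80 that belongs to an encoding of
    more than one byte is collected.
    """
    hits = []
    for ch in text:
        encoded = ch.encode(encoding)
        if len(encoded) > 1:
            hits.extend(b for b in encoded if b < 0x80)
    return hits


# KATAKANA LETTER SO (U+30BD) is 0x83 0x5C in Shift_JIS; the second byte
# is the same value as an ASCII backslash.
print(ascii_bytes_inside_multibyte("ソ", "shift_jis"))  # [92]

# In UTF-8 every byte of a multibyte sequence has the high bit set, so
# no ASCII-range byte can appear inside one.
print(ascii_bytes_inside_multibyte("ソ", "utf-8"))  # []
```

Under this sketch, a byte-oriented search for a backslash would report a false match in the Shift_JIS bytes but not in the UTF-8 bytes, which is why UTF-8 is a candidate for the special case mentioned above.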
_______________________________________________
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils