On 16/08/14 at 18:26, Vincent Lefevre wrote:
> On 2014-08-16 16:01:27 +0200, Santiago wrote:
> > Workaround attached. It's too slow against binary files, but I haven't
> > found a simpler solution.
>
> To avoid the slowness, I think that it would be better to detect
> (directly, not via PCRE) invalid UTF-8 sequences and replace them
> by null bytes *in-place*.
>
> It might slow down the general case, though. However I'm not sure,
> because if the UTF8 validity check (via the replacement of invalid
> sequences) is done in grep, it doesn't need to be done in PCRE.
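
For reference, the in-place replacement suggested in the quoted text could look roughly like the sketch below. This is not code from grep or PCRE: the names utf8_seq_len and scrub_invalid_utf8 are hypothetical, and I am assuming that "invalid" means anything outside the well-formed sequences of RFC 3629 (so overlong forms, surrogates and values above U+10FFFF are rejected too). Each byte that does not start a valid sequence is overwritten with a null byte, and scanning resumes at the next byte.

```c
#include <assert.h>
#include <stddef.h>

/* Return the length (1..4) of the well-formed UTF-8 sequence starting
   at p, or 0 if the bytes at p do not start one.  n is the number of
   bytes available at p and must be at least 1. */
static size_t utf8_seq_len(const unsigned char *p, size_t n)
{
    unsigned char c = p[0];
    unsigned char lo = 0x80, hi = 0xBF;   /* allowed range of 2nd byte */
    size_t need;

    if (c < 0x80)
        return 1;                          /* plain ASCII */
    else if (c >= 0xC2 && c <= 0xDF)
        need = 2;
    else if (c >= 0xE0 && c <= 0xEF) {
        need = 3;
        if (c == 0xE0) lo = 0xA0;          /* reject overlong forms */
        else if (c == 0xED) hi = 0x9F;     /* reject surrogates */
    } else if (c >= 0xF0 && c <= 0xF4) {
        need = 4;
        if (c == 0xF0) lo = 0x90;          /* reject overlong forms */
        else if (c == 0xF4) hi = 0x8F;     /* reject > U+10FFFF */
    } else
        return 0;   /* stray continuation byte, 0xC0, 0xC1, 0xF5..0xFF */

    if (n < need)
        return 0;                          /* truncated at end of buffer */
    if (p[1] < lo || p[1] > hi)
        return 0;
    for (size_t i = 2; i < need; i++)
        if ((p[i] & 0xC0) != 0x80)
            return 0;
    return need;
}

/* Overwrite, in place, every byte of buf that is not part of a
   well-formed UTF-8 sequence with a null byte.  Returns the number of
   bytes replaced. */
size_t scrub_invalid_utf8(unsigned char *buf, size_t len)
{
    size_t replaced = 0;
    size_t i = 0;

    assert(buf != NULL || len == 0);
    while (i < len) {
        size_t k = utf8_seq_len(buf + i, len - i);
        if (k == 0) {
            buf[i++] = 0;                  /* invalid byte: zero it */
            replaced++;
        } else
            i += k;                        /* skip the valid sequence */
    }
    return replaced;
}
```

This is a single linear pass over the buffer. Whether the result is then safe to hand to PCRE with its own UTF-8 check disabled is a separate question that this sketch does not answer.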

I think that would require similar work to replace the "invalid" content
in binary files.

Another solution would be not to check whether binary files are valid
UTF-8 (by passing PCRE_NO_UTF8_CHECK to pcre_exec), but I don't know
whether that would avoid security holes, and I don't know how to do it
either.

Regards,
Santiago