On 2014-09-12 14:39:35 -0700, Paul Eggert wrote:
> On 09/12/2014 02:29 PM, Vincent Lefevre wrote:
> >an option to control what happens on encoding errors would be
> >better and sufficient.
>
> It might suffice for your use cases, but it's more complicated and less
> flexible than being able to match bytes within the regular expression.

But IMHO, some solutions I proposed would be faster. I wonder whether
anyone is interested in matching individual bytes in a file regarded
as UTF-8 encoded. This seems weird.

> Speaking of hairy, why doesn't grep use PCRE_MULTILINE? Using
> PCRE_MULTILINE shouldn't be that hard, and should boost performance
> quite a bit in typical usage. Or am I being too optimistic here?

Perhaps in text files. In binary files, with the current solution, I
don't think this matters as failures due to invalid bytes typically
occur several times per line.

-- 
Vincent Lefèvre <vinc...@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)