On 11/27/2016 10:57 PM, Jim Meyering wrote:
> When grep is configured --with-included-regex, the following command
> fails to print the expected match:
>
>   printf '\351\n' |LC_ALL=fr_FR.iso88591 src/grep '[d-f]'

But the problem is that POSIX does NOT define what the "expected match"
should be. The very fact that you're using a non-C locale but passing a
range means that you have unspecified behavior per POSIX. Some regex
engines treat 'e' and 'e-acute' as both being part of the range, others
treat only 'e' as being part of the range. Expecting any particular
behavior is a bug, unless you know for sure that you are using GNU's
"rational range behavior", which explicitly treats ranges in ALL locales
the same as if they were in the C locale (that is, e-acute is never part
of the [d-f] range under rational range behavior).

> Since it's always been this way, I don't plan to attempt a work-around
> before the next release, and instead will probably arrange for that
> test to be skipped when grep is built with the included regex.
>
> Other ideas welcome,

We SHOULD be adjusting more and more GNU tools to honor rational range
behavior, at least as an option, even if that means that e-acute can
never be matched to [d-f].

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
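
For reference, the C-locale baseline that rational range behavior extends to every locale can be sketched as below. This is only an illustration, assuming a POSIX shell and any POSIX grep; the helper name in_range_c is hypothetical and not part of grep or its test suite. In the C locale the range [d-f] covers exactly the bytes 'd', 'e' and 'f', so byte 0xE9 (octal 351, e-acute in ISO-8859-1) falls outside it.

```shell
# Hypothetical helper (not part of grep): succeeds if the single byte
# given as a printf escape matches [d-f] when grep runs in the C locale.
in_range_c() {
    # $1 is used as a printf format so that an octal escape such as
    # '\351' is expanded to the raw byte before it reaches grep.
    printf "$1\n" | LC_ALL=C grep -q '[d-f]'
}

# Plain 'e' lies inside the range in the C locale.
if in_range_c 'e'; then echo "e: match"; else echo "e: no match"; fi

# Byte 0xE9 lies outside the byte range d..f in the C locale.
if in_range_c '\351'; then echo "0xE9: match"; else echo "0xE9: no match"; fi
```

What the same byte does under LC_ALL=fr_FR.iso88591 depends on the regex engine in use, which is exactly the unspecified case discussed above.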