'grep' is conforming to its specification, even though it's not as useful as it might be when searching German text. The situation with 'ß'/'SS' differs from the situation with 'lj'/'Lj'/'LJ' because in the latter case 'grep' is dealing only with individual characters, whereas 'SS' is two characters.

There's a related issue with 'ß' versus the recently introduced capital sharp-S 'ẞ'. These do not match each other with 'grep --ignore-case' in the current Savannah git master. This is an unfortunate consequence of how the glibc regex code behaves: it uppercases both pattern and data before comparing, but in the standard German locale 'ß' is unchanged by uppercasing, so it never compares equal to 'ẞ'.
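
To make the effect concrete, here is a rough Python sketch of comparing by one-to-one uppercasing. It is not the glibc regex code, and the helper names 'simple_upper' and 'match_ignore_case' are made up for illustration. It assumes that a character whose full uppercase form is longer than one character (such as 'ß', which expands to 'SS') is left unchanged by a character-at-a-time mapping, and it assumes 'lj'/'Lj'/'LJ' refers to the single digraph characters U+01C9, U+01C8 and U+01C7.

```python
def simple_upper(ch: str) -> str:
    """Approximate a one-to-one uppercase mapping for a single character.

    Python's str.upper() applies full case mapping, so 'ß' becomes 'SS'.
    A character-at-a-time mapping cannot return two characters, so this
    sketch leaves such characters unchanged (an assumption for illustration).
    """
    up = ch.upper()
    return up if len(up) == 1 else ch


def match_ignore_case(a: str, b: str) -> bool:
    """Compare two strings after uppercasing each character individually."""
    return [simple_upper(c) for c in a] == [simple_upper(c) for c in b]


SHARP_S = "\u00df"      # ß, LATIN SMALL LETTER SHARP S
CAP_SHARP_S = "\u1e9e"  # ẞ, LATIN CAPITAL LETTER SHARP S
LJ_SMALL = "\u01c9"     # single-character digraph lj
LJ_TITLE = "\u01c8"     # single-character digraph Lj
LJ_CAPITAL = "\u01c7"   # single-character digraph LJ

if __name__ == "__main__":
    # The digraphs all map to one uppercase character, so they match.
    print(match_ignore_case(LJ_SMALL, LJ_TITLE))
    print(match_ignore_case(LJ_TITLE, LJ_CAPITAL))
    # 'ß' stays 'ß' and 'ẞ' stays 'ẞ', so they do not match.
    print(match_ignore_case(SHARP_S, CAP_SHARP_S))
    # 'ß' is one character and 'SS' is two, so they cannot match either.
    print(match_ignore_case("stra" + SHARP_S + "e", "STRASSE"))
```

In this sketch the digraph comparisons succeed while both sharp-S comparisons fail, which is the asymmetry described above.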

I'll leave this bug open, as it is an awkward situation. Fixing it would require changing the glibc regex code, which is a big deal -- it would have performance implications for a lot of programs. So I'm not optimistic about fixing it any time soon.


