The mapping in the ICUFoldingFilter has been pretty static, as far as I know, 
and I'm wondering whether there's any willingness to apply adjustments.


At the Cornell Library, we're finding ourselves having to write separate 
PatternReplace mapping rules because specific characters that should be 
handled by the filter's punctuation folding are instead being handled by its 
diacritic removal. The result is extremely troublesome search behavior for 
Romanized Arabic text, at least, and it probably creates issues with other 
languages. I have conferred with a colleague at the Columbia University Library, 
who has had to make similar adjustments to the mapping.


I would ideally like to see these changes made at the source, which would 
simplify my configuration, and improve searching for everyone depending on this 
filter to take care of the messy international characters out of the box. Would 
such an update be likely to be accepted, or is the filter wed to its current 
configuration?


Thanks,


Frances


Frances Webb

Developer

Cornell University Library
