Hi.
Any good protwords.txt out there?
In a fairly standard Solr analyzer chain, we use the English Porter stemmer
like so:
For most purposes the Porter stemmer does just fine, but occasionally words come
along that really don't work out too well, e.g.,
"maine" is stemmed to "main" - clearly goofing
Hi.
I have encountered a problem searching in my application because of
inconsistent Unicode normalization forms in the corpus (and the queries). I
would like to normalize to form NFKD in an analyzer (I think). I was thinking
about creating a filter similar to the LowerCaseFilter that would do
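
To make the normalization step concrete, here is a rough sketch of the per-token work such a filter might do, using the JDK's `java.text.Normalizer`. The class name and the `toNfkd` helper are hypothetical, and this is only the string-level part, not an actual Lucene TokenFilter.

```java
import java.text.Normalizer;

final class NfkdNormalizeDemo {
    // Returns the NFKD form of the text; already-normalized input is returned unchanged.
    static String toNfkd(String text) {
        if (Normalizer.isNormalized(text, Normalizer.Form.NFKD)) {
            return text;
        }
        return Normalizer.normalize(text, Normalizer.Form.NFKD);
    }

    public static void main(String[] args) {
        String composed = "caf\u00e9";     // precomposed e-acute (U+00E9)
        String decomposed = "cafe\u0301";  // "e" followed by combining acute (U+0301)

        // The raw strings differ, but their NFKD forms are identical.
        System.out.println(composed.equals(decomposed));
        System.out.println(toNfkd(composed).equals(toNfkd(decomposed)));
    }
}
```

The same call would have to be applied at both index time and query time, otherwise the two sides would still disagree.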