> If I've given different advice in the past, I'm sure I had a good reason
> for it -- possibly due to some aspect of those problems that is subtly
> different from yours ... can you post links to the specific messages
> you're referring to? It might help jog my memory.

One thread is:
http://www.nabble.com/synonyms-td16284520.html

Based on my reading of that thread, I believe the issue raised there is the same as the one I just raised, but the original post was not entirely clear and was perhaps easy to misunderstand.

Another thread is:
http://www.nabble.com/stemming-the-synonyms-to16945953.html#a16945953

> A recently added feature is that when configuring SynonymFilterFactory
> you can give it the name of a TokenizerFactory to use when parsing the
> synonym file. This could be used to stem words *if* you write a
> TokenizerFactory that calls out to your Stemmer.

Ah, cool. I will give the SOLR 1.3 nightlies a spin once I make it past my current deadlines and obligations.

> (see SOLR-319 for the background on why you can only specify a Tokenizer
> and not a full "fieldType" to get the analysis chain from ... in a
> nutshell: 1. it would have been harder to implement; 2. the only use cases
> people could think of were Tokenization based.)

There probably needs to be a chain of tokenizers, because in German, compound words need to be split before they are stemmed. I will take a stab at writing the TokenizerFactory that chains them. It should not be too difficult.

Best regards - Christian
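
P.S. To make the chaining idea concrete, here is a rough sketch of the order of operations (split compounds first, then stem each part), written in Python rather than as an actual Solr TokenizerFactory. Everything in it is an assumption for illustration only: the tiny word list `KNOWN_PARTS`, the suffix list `SUFFIXES`, and the function names are made up, and a real setup would call a proper decompounder and German stemmer instead of these toy rules.

```python
# Toy illustration of "split compounds, then stem"; not Solr code.

# Hypothetical mini-lexicon of word parts (a real decompounder needs a large one).
KNOWN_PARTS = {"haus", "tür", "türen", "garten"}

# Hypothetical suffix list standing in for a real German stemmer.
SUFFIXES = ("en", "er", "e")


def split_compound(word, parts=KNOWN_PARTS):
    """Greedy longest-match split; returns [word] if no complete split exists."""
    word = word.lower()
    for end in range(len(word), 0, -1):
        head = word[:end]
        if head not in parts:
            continue
        if end == len(word):
            return [head]
        rest = split_compound(word[end:], parts)
        # Accept the split only if every remaining piece is a known part.
        if all(piece in parts for piece in rest):
            return [head] + rest
    return [word]


def stem(token, min_len=3):
    """Strip the first matching suffix, keeping at least min_len characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= min_len:
            return token[: -len(suffix)]
    return token


def analyze(text):
    """The chain: whitespace tokenize, then split compounds, then stem each part."""
    return [stem(part) for word in text.split() for part in split_compound(word)]


def analyze_synonym_line(line):
    """Run each comma-separated synonym entry through the same chain."""
    return [analyze(entry) for entry in line.split(",")]


if __name__ == "__main__":
    print(analyze("Haustüren"))
    print(analyze_synonym_line("Gartentüren, Haustüren"))
```

The point of the sketch is only that the synonym file and the indexed text go through the same chain, so that entries such as "Haustüren" and "Haustür" end up as the same tokens. If stemming ran before splitting, the suffix rules would only ever see the end of the whole compound.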