Well, what effect do you _want_? I'd probably put it after the PorterStemFilterFactory. Where you're proposing to put it, it'll form a bunch of ngrams first, then WordDelimiterFilterFactory will try to break them up according to _its_ rules, and eventually you'll be sending absolute gibberish to the stemmer. I mean, what is the stemmer going to think of (starting out with "running") ru, run, runn, runni, runnin, running?
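
To make the ordering concrete, here is a rough sketch of the index-time analyzer with the EdgeNGramFilterFactory moved to the end of the chain, after the stemmer. The minGramSize and maxGramSize values are assumptions picked for illustration (a minimum of 2 matches the "ru, run, ..." example); every other filter is copied from the index analyzer in the original question. Treat it as a starting point to try in admin/analysis, not a tested configuration.

```xml
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true"
          words="stopwords.txt" enablePositionIncrements="true"/>
  <filter class="solr.PatternReplaceFilterFactory" pattern="\."
          replacement="" replace="all"/>
  <filter class="solr.WordDelimiterFilterFactory"
          generateWordParts="1" generateNumberParts="1" catenateWords="1"
          catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"
          preserveOriginal="1"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.KeywordMarkerFilterFactory"
          protected="protwords.txt"/>
  <filter class="solr.PorterStemFilterFactory"/>
  <!-- Edge ngrams last, so the stemmer only ever sees whole words.
       Gram sizes are assumed values; tune them for your data. -->
  <filter class="solr.EdgeNGramFilterFactory"
          minGramSize="2" maxGramSize="15"/>
</analyzer>
```

With this order the ngrams are built from the lowercased, stemmed tokens. The query analyzer would normally be left without the ngram filter, so that a typed prefix is matched against the indexed grams.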
I suggest you spend some time with admin/analysis with various orderings
to understand better how all the parts interact.

Best
Erick

On Tue, Apr 24, 2012 at 11:20 AM, geeky2 <gee...@hotmail.com> wrote:
> hello all,
>
> i want to experiment with the EdgeNGramFilterFactory at index time.
>
> i believe this needs to go in post tokenization - but i am doing a pattern
> replace as well as other things.
>
> should the EdgeNGramFilterFactory go in right after the pattern replace?
>
> <fieldType name="text_en_splitting" class="solr.TextField"
>     positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>         words="stopwords.txt" enablePositionIncrements="true"/>
>     <filter class="solr.PatternReplaceFilterFactory" pattern="\."
>         replacement="" replace="all"/>
>
>     *put EdgeNGramFilterFactory here ===> ?*
>
>     <filter class="solr.WordDelimiterFilterFactory"
>         generateWordParts="1" generateNumberParts="1" catenateWords="1"
>         catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"
>         preserveOriginal="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.KeywordMarkerFilterFactory"
>         protected="protwords.txt"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>         words="stopwords.txt" enablePositionIncrements="true"/>
>     <filter class="solr.PatternReplaceFilterFactory" pattern="\."
>         replacement="" replace="all"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>         generateWordParts="1" generateNumberParts="1" catenateWords="0"
>         catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"
>         preserveOriginal="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.KeywordMarkerFilterFactory"
>         protected="protwords.txt"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> thanks for any help,
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/correct-location-in-chain-for-EdgeNGramFilterFactory-tp3935589p3935589.html
> Sent from the Solr - User mailing list archive at Nabble.com.