Re: order of analyzers, tokenizers and filters

Rafał Kuć Tue, 14 Sep 2010 05:08:02 -0700

Hello!

   The tokenizer is always executed before the filters, regardless of
the order in which they are listed in schema.xml, because the tokenizer
"generates" the tokens and the filters then operate on them.
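
   If the replacement has to happen before tokenizing, it needs to be
done by a char filter, since char filters work on the raw text before
the tokenizer sees it. Below is a rough sketch of how the quoted field
type could look, written in the order the stages actually run. It
assumes your Solr version includes solr.PatternReplaceCharFilterFactory
(I have not verified which releases ship it), so please treat it as an
untested suggestion and check the result on the analysis page.

```xml
<fieldType name="textspell" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <!-- Stage 1: char filters, applied to the raw character stream -->
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <!-- Assumption: this factory is available in your Solr version -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="/" replacement=" / "/>

    <!-- Stage 2: the tokenizer, which produces the tokens -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>

    <!-- Stage 3: token filters, applied in the order listed -->
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="1" catenateNumbers="1" catenateAll="0"
            splitOnCaseChange="0"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords_spelling.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With the slash padded by spaces before tokenizing, the
WhitespaceTokenizer should then emit "/" as a separate token.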


> hi,
> it's the second time i have stumbled across some strange behaviour:

> in my schema.xml i have defined 

>     <fieldType name="textspell" class="solr.TextField"
> positionIncrementGap="100">
>       <analyzer type="index">
>         <!-- sg324 inkl. HTMLStrip... -->
>         <charFilter class="solr.HTMLStripCharFilterFactory" />
>         <filter class="solr.PatternReplaceFilterFactory" pattern="/"
> replacement=" / " replace="all"/>
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>         <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords.txt" enablePositionIncrements="true" />
>         <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
>         <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords_spelling.txt" enablePositionIncrements="true" />
>         <filter class="solr.LowerCaseFilterFactory"/>
>       </analyzer>

> i can't place the PatternReplaceFilter before the WhitespaceTokenizer. i
> have the schema like above, did a reload of my core, but
> when i go to analyze in the admin i can see that the WhitespaceTokenizer
> is executed before the PatternReplaceFilter.

> is there a general order of execution?

> markus





-- 
Regards,
 Rafał Kuć
