Hi all,

I'm trying to add a filter derived from CachingTokenFilter to the index analyzer chain for field "text". I need CachingTokenFilter because I have to look ahead in the token stream: my filter is a "stop phrases" filter, which looks ahead in the stream to see whether a stop phrase follows and, if so, removes it from the token stream.
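To make the look-ahead idea concrete, here is a simplified sketch of the matching logic only. It works on plain lists of strings rather than on Lucene's TokenStream API, it is not the code of my actual filter, and the class and method names (StopPhraseSketch, removeStopPhrases, longestMatchAt) are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: not the real StopPhrasesFilter,
// and no Lucene/Solr classes are involved.
class StopPhraseSketch {

    // Returns a new list with every occurrence of a stop phrase removed.
    static List<String> removeStopPhrases(List<String> tokens,
                                          List<List<String>> stopPhrases) {
        List<String> kept = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            // Look ahead from position i for the longest matching stop phrase.
            int matched = longestMatchAt(tokens, i, stopPhrases);
            if (matched > 0) {
                i += matched;            // skip the whole phrase
            } else {
                kept.add(tokens.get(i)); // no phrase starts here: keep the token
                i++;
            }
        }
        return kept;
    }

    // Length of the longest stop phrase starting at 'start', or 0 if none matches.
    static int longestMatchAt(List<String> tokens, int start,
                              List<List<String>> stopPhrases) {
        int best = 0;
        for (List<String> phrase : stopPhrases) {
            int len = phrase.size();
            if (len > best && start + len <= tokens.size()
                    && tokens.subList(start, start + len).equals(phrase)) {
                best = len;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> tokens =
                Arrays.asList("per", "quanto", "riguarda", "il", "prodotto");
        List<List<String>> stopPhrases =
                Arrays.asList(Arrays.asList("per", "quanto", "riguarda"));
        // prints [il, prodotto]
        System.out.println(removeStopPhrases(tokens, stopPhrases));
    }
}
```

In a real token filter the same decision has to be made while consuming the stream, which is why some form of buffering or caching of upcoming tokens is needed.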
When I test the correctness of the chain using this query:

/solr/analysis/field?analysis.fieldname=description&analysis.fieldtype=text&analysis.fieldvalue=...

everything looks fine (I can see that the stop phrases are removed from the token stream). But when I index documents, the index is completely empty: all searches on "text" fields return no results at all!

Here is my index chain, where StopPhrasesFilterFactory is my custom filter, which derives from CachingTokenFilter:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <!-- Case insensitive stop word removal.
         add enablePositionIncrements=true in both the index and query
         analyzers to leave a 'gap' for more accurate phrase queries.
    -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="org.apache.solr.analysis.StopPhrasesFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="Italian" protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="Italian" protected="protwords.txt"/>
  </analyzer>
</fieldType>

Is it wrong to use CachingTokenFilter in the index chain?

Regards,
Enrico