Hi,

On Solr 4.7 I've run into a strange issue. Whilst setting up a field, I
noticed in the analysis form that when I use a char filter factory (for
example HTMLStripCharFilterFactory, shown as HTMLSCF) with a tokeniser
(StandardTokenizerFactory, shown as ST), the analysis chain grinds to a
halt. The char filter does not seem to pass anything to the tokeniser.

Field type is:

<fieldType name="clean_text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
  </analyzer>
</fieldType>

The output of the analysis screen is:

Field value (index)
Content with mark up <br /> should be cleaned

HTMLSCF > Content with mark up should be cleaned
ST > <BLANK>

I know I must be missing something obvious!

Cheers Lee C