The word delimiter filter is removing the special characters. You can add a file listing the special characters that you want treated as alpha, and point the filter at it with the "types" parameter.
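
As a rough sketch of what that could look like in your field type (the file name wdfftypes.txt is only an assumption here; any file in the core's conf directory should work, and the mappings shown are examples rather than a complete list):

```xml
<!-- Sketch only: same WordDelimiterFilterFactory settings as in the original
     schema, plus a "types" attribute pointing at a character-type mapping file.
     The file name wdfftypes.txt is an assumed, illustrative name. -->
<filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1" generateNumberParts="1"
        catenateWords="1" catenateNumbers="1" catenateAll="1"
        splitOnCaseChange="0" splitOnNumerics="1"
        stemEnglishPossessive="1" preserveOriginal="1"
        types="wdfftypes.txt"/>

<!-- Assumed contents of conf/wdfftypes.txt, one mapping per line:

     % => ALPHA
     & => ALPHA

     Each listed character is then treated as a letter instead of a delimiter. -->
```

After changing the analyzer you would need to reload the core and reindex for the new tokens to show up in the index.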

-- Jack Krupansky

On Mon, Jul 13, 2015 at 6:43 PM, Steven White <swhite4...@gmail.com> wrote:

> Hi Everyone,
>
> I think the subject line said it all. Here is the schema I'm using:
>
> <fieldType name="my_text" class="solr.TextField" positionIncrementGap="100"
>            autoGeneratePhraseQueries="true">
>   <analyzer>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="lang/stopwords_en.txt"/>
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
>             generateNumberParts="1" catenateWords="1" catenateNumbers="1"
>             catenateAll="1" splitOnCaseChange="0" splitOnNumerics="1"
>             stemEnglishPossessive="1" preserveOriginal="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> I'm guessing this is due to how solr.WhitespaceTokenizerFactory works and
> those that it is not indexing are removed because they are considered
> "white-spaces"? If so, how can I include %, &, etc. into this none-indexed
> list? I would rather see all these not indexed vs some are and some are
> not causing confusion to my users.
>
> Thanks
>
> Steve