Re: C++ being filtered (please help)

Chris Hostetter Thu, 04 Feb 2010 12:21:10 -0800

: > now i want to tokenize it based on comma or white space and
: > other word
: > delimiting characters only. Not on the plus sign. so that
: > result after
: > tokenization should be
        ...
: > But the result I am getting is


...you haven't told us what type of analyzer settings you are currently 
using, so it's completely impossible to give you specific advice on what 
to do -- the problem may not be your current tokenizer at all, it might be 
some TokenFilter that is being applied after tokenization.

: <charFilter class="solr.MappingCharFilterFactory" mapping="mappings.txt" /> 
: <tokenizer class="solr.WhitespaceTokenizerFactory" /> 
: <filter class="solr.LowerCaseFilterFactory" />        

that seems like somewhat overkill for the problem of "i want to tokenize on 
an explicit list of characters" ... using the PatternTokenizerFactory (in 
place of the MappingCharFilterFactory and the WhitespaceTokenizerFactory) 
would probably be a little more straightforward.
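
A minimal sketch of what that could look like, assuming the delimiters are 
whitespace, commas, and semicolons (the actual list isn't given in this 
thread, so the character class is only a placeholder) and keeping the 
existing LowerCaseFilterFactory. Untested, so treat it as a starting point:

```xml
<analyzer>
  <!-- Assumption: split on runs of whitespace, commas, or semicolons.
       "+" is deliberately left out of the character class. -->
  <tokenizer class="solr.PatternTokenizerFactory" pattern="[\s,;]+" />
  <filter class="solr.LowerCaseFilterFactory" />
</analyzer>
```

As far as I know, when no "group" attribute is given the pattern is treated 
as a delimiter rather than as the token itself, so "C++" should survive 
tokenization intact and come out of the lowercase filter as "c++".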

-Hoss
