Payload doesn't apply to WordDelimiterFilterFactory-generated tokens

Lox Mon, 04 Jul 2011 00:26:05 -0700

Hi, I have a problem with the WordDelimiterFilterFactory and the
DelimitedPayloadTokenFilterFactory.
It seems that the payloads are attached only to the original token that I
index; the WordDelimiterFilter doesn't copy them onto the tokens it
generates.


For example, imagine I index the string JavaProject|1.7.
At the end of my analyzer pipeline it is transformed like this:
JavaProject|1.7 -----> javaproject|1.7 java project

Instead, what I would like is a result like this:
JavaProject|1.7 -----> javaproject|1.7 java|1.7 project|1.7

This way the payload would be applied to the document even in case of
partial matches on the original word.
I have used the pipe notation here only for illustration; imagine those
payloads are already stored internally in Solr.
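
To make the desired output concrete, here is a minimal plain-Python sketch of
the behaviour I am after. It is only an illustration, not Lucene or Solr code:
the helper names (split_payload, word_parts, desired_tokens) are made up for
this example, and the case-change regex is a rough stand-in for what the
WordDelimiterFilter does, not its real splitting rules.

```python
import re


def split_payload(raw, delimiter="|"):
    """Split 'term|payload' into (term, payload); payload is None if absent."""
    term, sep, payload = raw.partition(delimiter)
    return term, (float(payload) if sep else None)


def word_parts(term):
    """Rough stand-in for word-delimiter splitting: break on case changes and digits."""
    return re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|[0-9]+", term)


def desired_tokens(raw):
    """Return (token, payload) pairs with the payload copied to every sub-token."""
    term, payload = split_payload(raw)
    parts = word_parts(term)
    # Keep the original token first (like preserveOriginal="1"), then the parts.
    tokens = [term] + (parts if len(parts) > 1 else [])
    # Lowercase each token and attach the same payload to all of them.
    return [(t.lower(), payload) for t in tokens]


print(desired_tokens("JavaProject|1.7"))
# [('javaproject', 1.7), ('java', 1.7), ('project', 1.7)]
```

In other words, every token derived from the original word should carry the
payload of the original word.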

How can I do this?

If it is needed, my analyzer looks like this:
<fieldType name="text_C" class="solr.TextField" positionIncrementGap="100"
           stored="false" indexed="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.DelimitedPayloadTokenFilterFactory" encoder="float"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="^[a-z]{2,5}[0-9]{1,4}?([.]|[a-z])?(.*)"
            replacement="" replace="all"/>
    <filter class="solr.WordDelimiterFilterFactory"
            preserveOriginal="1"
            generateNumberParts="1"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="1" max="30"/>
    <filter class="solr.SnowballPorterFilterFactory"
            language="English"
            protected="protwords.txt"/>
  </analyzer>
                .
                .
                .

Thank you.


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Payload-doesn-t-apply-to-WordDelimiterFilterFactory-generated-tokens-tp3136748p3136748.html
Sent from the Solr - User mailing list archive at Nabble.com.
