Re: Are analysers applied to each value in a multi-valued field separately?

Jack Krupansky Tue, 16 Jul 2013 06:17:12 -0700

Yes, each input value is analyzed separately. Solr passes each input value to Lucene and then Lucene analyzes each.

You could use LimitTokenPositionFilterFactory, which uses the absolute token position: each successive analyzed value would have an incremented position, plus the positionIncrementGap (typically 100 for text).
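
As a rough sketch of that suggestion, the "top" field could be declared along these lines. The type name "text_top", the tokenizer and lowercase filter, and the limit of 50 are illustrative assumptions, not taken from the original schema. How the position limit interacts with the gap between values is worth checking on your own Solr version before relying on it.

```xml
<!-- Sketch only: "text_top", the analysis chain and maxTokenPosition="50"
     are hypothetical choices for illustration. -->
<fieldType name="text_top" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Limits tokens by position rather than by count -->
    <filter class="solr.LimitTokenPositionFilterFactory" maxTokenPosition="50"/>
  </analyzer>
</fieldType>

<!-- Multi-valued because two copyField sources feed it; indexed but not stored -->
<field name="top" type="text_top" indexed="true" stored="false" multiValued="true"/>
```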


-- Jack Krupansky

-----Original Message-----
From: Daniel Collins

Sent: Tuesday, July 16, 2013 8:46 AM
To: solr-user@lucene.apache.org

Subject: Are analysers applied to each value in a multi-valued field separately?


I'm guessing the answer is yes, but here's the background.

We index 2 separate fields, headline and body text for a document, and then
we want to identify the "top" of the story which is the headline + N words
of the body (we want to weight that in scoring).

So to do that:

<copyField src="headline" dest="top"/>
<copyField src="body" dest="top"/>

And the "top" field has a LimitTokenCountFilterFactory appended to it to do
the limiting.

<filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="N"/>

I realised that top needs to be multi-valued, which got me thinking: is
that N tokens PER VALUE of top or N tokens in total within the top field...
The field is indexed but not stored, so it's hard to determine exactly
which is being done.

Logically, I presume each value in the field is independent (and Solr then
just matches searches against each one), so that would suggest N is per
value?

Cheers, Daniel
