Re: how does Solr/Lucene index multi-value fields

Ian Holsman Tue, 31 May 2011 10:15:35 -0700

Thanks Erick.

sadly in my use-case I don't think that would work. I'll go back to storing them 
at the story level, and hitting a DB to get related stories I think.


--I
On May 31, 2011, at 12:27 PM, Erick Erickson wrote:

> Hmmm, I may have misled you. Re-reading my text, it
> wasn't very well written....
> 
> TF/IDF calculations are, indeed, per-field. I was trying
> to say that there was no difference between storing all
> the data for an individual field as a single long string of text
> in a single-valued field or as several shorter strings in
> a multi-valued field.
> 
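To illustrate the point above, here is a toy sketch (not Lucene's actual code) that counts term frequencies for a field, once from several short values and once from the same text joined into one string. The whitespace tokenizer and the function names are assumptions made for the illustration. Solr's positionIncrementGap, which concerns phrase matching across value boundaries, is not modeled.

```python
from collections import Counter


def term_freqs_single(text):
    """Term frequencies for a single-valued field (toy whitespace tokenizer)."""
    return Counter(text.lower().split())


def term_freqs_multi(values):
    """Term frequencies for a multi-valued field.

    Every value feeds the same per-field counter, so the counts are
    accumulated over the whole field, not per value.
    """
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    return counts


# Hypothetical field content: three short values vs. one joined string.
values = ["solr indexes text", "lucene scores text", "text everywhere"]
joined = " ".join(values)

multi = term_freqs_multi(values)
single = term_freqs_single(joined)
print(multi == single)
```

In this toy model the two counters are identical, which is why splitting the text into several values does not give each value its own relevance score.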
> Best
> Erick
> 
> On Tue, May 31, 2011 at 12:16 PM, Ian Holsman <had...@holsman.net> wrote:
>> 
>> On May 31, 2011, at 12:11 PM, Erick Erickson wrote:
>> 
>>> Can you explain the use-case a bit more here? Especially the post-query
>>> processing and how you expect the multiple documents to help here.
>>> 
>> 
>> we have a collection of related stories. when a user searches for something, 
>> we might not want to display the story that is most-relevant (according to 
>> SOLR), but according to other home-grown rules.  by combining all the 
>> possibilities in one SolrDocument, we can avoid a DB-hit to get related 
>> stories.
>> 
>> 
>>> But TF/IDF is calculated over all the values in the field. There's really no
>>> difference between a multi-valued field and storing all the data in a
>>> single field
>>> as far as relevance calculations are concerned.
>>> 
>> 
>> so.. it will suck regardless.. I thought we had per-field relevance in the 
>> current trunk. :-(
>> 
>> 
>>> Best
>>> Erick
>>> 
>>> On Tue, May 31, 2011 at 11:02 AM, Ian Holsman <had...@holsman.net> wrote:
>>>> Hi.
>>>> 
>>>> I want to store a list of documents (say each being 30-60k of text) into a 
>>>> single SolrDocument. (to speed up post-retrieval querying)
>>>> 
>>>> In order to do this, I need to know if lucene calculates the TF/IDF score 
>>>> over the entire field or does it treat each value in the list as a unique 
>>>> field?
>>>> 
>>>> If I can't store it as a multi-value, I could create a schema where I put 
>>>> each document into a unique field, but I'm not sure how to create the 
>>>> query to search all the fields.
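One possible way to search several fields at once is Solr's dismax query parser, which takes the list of fields in its `qf` parameter. The sketch below only builds the request parameters. The field names `story_1` to `story_3`, the `/select` path, and the helper name are hypothetical, and whether this suits the schema described here is an assumption.

```python
from urllib.parse import urlencode


def build_multi_field_query(user_query, fields):
    """Build a Solr select URL that searches every field listed in `fields`.

    Uses the dismax parser: `qf` is a space-separated list of fields to query.
    """
    params = {
        "q": user_query,
        "defType": "dismax",
        "qf": " ".join(fields),
    }
    return "/select?" + urlencode(params)


# Hypothetical per-story fields; real names depend on the schema.
fields = ["story_1", "story_2", "story_3"]
url = build_multi_field_query("budget vote", fields)
print(url)
```

The drawback of this layout is that the number of fields grows with the number of stories per document, so `qf` has to be kept in sync with the schema (or generated from it).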
>>>> 
>>>> 
>>>> Regards
>>>> Ian
>>>> 
>>>> 
>> 
>>
