Re: weighted search and index

Lance Norskog Wed, 03 Mar 2010 21:08:45 -0800

By convention, a boost of 1.0 is "flat" (neutral). People usually boost
with numbers like 3, 5, or 20.
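
As a rough illustration of that convention, the 0-to-1 weights from the
thread could be rescaled into boosts above 1.0 before the query string is
built. This is only a sketch: the helper name build_boosted_query, the scale
factor of 10, and the choice of OR between clauses are assumptions made for
the example, not anything Solr prescribes. It emits standard Lucene
term^boost syntax and does not escape special query characters.

```python
def build_boosted_query(weighted_terms, scale=10.0, operator="OR"):
    """Turn (term, weight) pairs into a Lucene-style boosted query string.

    weighted_terms: iterable of (term, weight) with weights in (0, 1].
    scale: hypothetical factor that lifts weights above the neutral 1.0.
    operator: boolean operator placed between clauses (assumed "OR").
    """
    clauses = []
    for term, weight in weighted_terms:
        boost = weight * scale
        # Quote multi-word terms so the parser treats them as a phrase.
        text = f'"{term}"' if " " in term else term
        # ":g" drops trailing zeros, so 8.0 is written as 8.
        clauses.append(f"{text}^{boost:g}")
    return f" {operator} ".join(clauses)


if __name__ == "__main__":
    print(build_boosted_query([("sports", 0.8), ("golf", 0.5)]))
```

The resulting string would go in the q parameter of the request. Adding
debugQuery=on, as suggested further down the thread, shows how each boost
feeds into the score.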


On Wed, Mar 3, 2010 at 6:34 PM, Jianbin Dai <j...@huawei.com> wrote:
> Hi Erick,
>
> Each doc contains some keywords that are indexed. However each keyword is
> associated with a weight to represent its importance. In my example,
> D1: fruit 0.8, apple 0.4, banana 0.2
>
> The keyword fruit is the most important keyword, which means I really really
> want it to be matched in a search result, but banana is less important (It
> would be good to be matched though).
>
> Hope that explains.
>
> Thanks.
>
> JB
>
>
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Wednesday, March 03, 2010 6:23 PM
> To: solr-user@lucene.apache.org
> Subject: Re: weighted search and index
>
> Then I'm totally lost as to what you're trying to accomplish. Perhaps
> a higher-level statement of the problem would help.
>
> Because no matter how often I look at your point <2>, I don't see
> what relevance the numbers have if you're not using them to
> boost at index time. Why are they even there?
>
> Erick
>
> On Wed, Mar 3, 2010 at 8:54 PM, Jianbin Dai <j...@huawei.com> wrote:
>
>> Thank you very much Erick!
>>
>> 1. I used boost in search, but I don't know exactly what the best way to
>> boost is. For Sports 0.8, golf 0.5 in my example, would it be
>> sports^0.8 AND golf^0.5 ?
>>
>>
>> 2. I cannot use boost in indexing. Because the weight of the value changes,
>> not the field, look at this example again,
>>
>> C1: fruit 0.8, apple 0.4, banana 0.2
>> C2: music 0.9, pop song 0.6, Britney Spears 0.4
>>
>> There is no good way to boost it during indexing.
>>
>> Thanks.
>>
>> JB
>>
>>
>> -----Original Message-----
>> From: Erick Erickson [mailto:erickerick...@gmail.com]
>> Sent: Wednesday, March 03, 2010 5:45 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: weighted search and index
>>
>> You have to provide some more details to get meaningful help.
>>
>> You say "I was trying to use boosting". How? At index time?
>> Search time? Both? Can you provide some code snippets?
>> What does your schema look like for the relevant field(s)?
>>
>> You say "but seems not working right". What does that mean? No hits?
>> Hits not ordered as you expect? Have you tried putting "&debugQuery=on" on
>> your URL and examined the return values?
>>
>> Have you looked at your index with the admin page and/or Luke to see if
>> the data in the index is as you expect?
>>
>> As far as I know, boosts are multiplicative. So boosting by a value less
>> than 1 will actually decrease the ranking. But see the Lucene scoring
>> documentation:
>>
>> http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html
>>
>> And remember, that boosting will *tend* to move a hit up or down in the
>> ranking, not position it absolutely.
>>
>> HTH
>> Erick
>>
>> On Wed, Mar 3, 2010 at 8:13 PM, Jianbin Dai <j...@huawei.com> wrote:
>>
>> > Hi,
>> >
>> > I am trying to use solr for a content match application.
>> >
>> > A content is described by a set of keywords with weights associated, e.g.,
>> >
>> > C1: fruit 0.8, apple 0.4, banana 0.2
>> > C2: music 0.9, pop song 0.6, Britney Spears 0.4
>> >
>> > Those contents would be indexed in solr.
>> > In the search, I also have a set of keywords with weights:
>> >
>> > Query: Sports 0.8, golf 0.5
>> >
>> > I am trying to find the closest matching contents for this query.
>> >
>> > My question is how to index the contents with weighted scores, and how to
>> > write the search query. I was trying to use boosting, but it does not
>> > seem to be working right.
>> >
>> > Thanks.
>> >
>> > Jianbin
>> >
>> >
>> >
>>
>>
>
>



-- 
Lance Norskog
goks...@gmail.com
