Re: Question About Boosting.
Buckets it is :) Thx

On 3/12/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:
> : I thought about this option but it doesn't sound scalable. What
> : happens if i have 100 words with 100 different boost factors?
>
> then you've got a problem :)
>
> typically it's not this severe ... i'll frequently have half a dozen
> fields that i divide text up into to boost by different amounts, but
> i'm having a hard time understanding why you would need 100 unique
> boost factors for 100 unique words ... putting things in buckets tends
> to be effective.
>
> -Hoss
Re: Question About Boosting.
: I thought about this option but it doesn't sound scalable. What
: happens if i have 100 words with 100 different boost factors?

then you've got a problem :)

typically it's not this severe ... i'll frequently have half a dozen fields that i divide text up into to boost by different amounts, but i'm having a hard time understanding why you would need 100 unique boost factors for 100 unique words ... putting things in buckets tends to be effective.

-Hoss
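A minimal sketch of the bucket idea above, assuming three hypothetical fields (`text_high`, `text_medium`, `text`) and made-up boost values and weight thresholds; none of these names or numbers come from the thread. Each term lands in one of a few fields instead of carrying its own boost factor, and the boosts are attached to the fields at query time.

```python
# Hypothetical bucket fields and their query-time boosts (illustrative only).
BUCKET_BOOSTS = {"text_high": 4.0, "text_medium": 2.0, "text": 1.0}


def bucket_for(weight):
    """Map a per-term importance weight (assumed to be in 0..1) to a bucket field."""
    if weight >= 0.75:
        return "text_high"
    if weight >= 0.4:
        return "text_medium"
    return "text"


def build_document(doc_id, weighted_terms):
    """Group (term, weight) pairs into bucket fields and return a field -> text dict."""
    fields = {name: [] for name in BUCKET_BOOSTS}
    for term, weight in weighted_terms:
        fields[bucket_for(weight)].append(term)
    doc = {"id": doc_id}
    for name, terms in fields.items():
        if terms:  # skip empty buckets
            doc[name] = " ".join(terms)
    return doc


def qf_param():
    """Render the buckets as 'field^boost' pairs, the form dismax's qf parameter takes."""
    return " ".join(f"{name}^{boost:g}" for name, boost in BUCKET_BOOSTS.items())


doc = build_document("1", [("solr", 0.9), ("search", 0.5), ("the", 0.1)])
print(doc)
print(qf_param())
```

With 100 differently weighted words, the thresholds collapse them into three boost levels, which is the trade-off the bucket approach accepts.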
Re: Question About Boosting.
I thought about this option but it doesn't sound scalable. What happens if i have 100 words with 100 different boost factors?

On 3/12/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:
> : I have elements within a field that have different importance.
> : I thought boosting would be an elegant way to take this into account.
> : Please advise,
>
> typically if you know when sending the doc to solr that certain
> words/phrases of field A are extremely significant for that document,
> the simple approach is to also put those words/phrases in some other
> field "B" and at query time search both A and B ... since B tends to
> have fewer words anyway, it makes more of an impact on the results,
> but if you want those words to be *really* important, boost your
> queries on B.
>
> The dismax handler makes querying across these multiple fields very
> easy.
>
> -Hoss
Re: Question About Boosting.
: I have elements within a field that have different importance.
: I thought boosting would be an elegant way to take this into account.
: Please advise,

typically if you know when sending the doc to solr that certain words/phrases of field A are extremely significant for that document, the simple approach is to also put those words/phrases in some other field "B" and at query time search both A and B ... since B tends to have fewer words anyway, it makes more of an impact on the results, but if you want those words to be *really* important, boost your queries on B.

The dismax handler makes querying across these multiple fields very easy.

-Hoss
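As a rough illustration of searching A and B together with a heavier weight on B, the sketch below only builds the request parameters. The `qf` parameter with `field^boost` pairs is how dismax takes per-field weights; the boost value 3 and the query text are made up. Selecting the parser with `defType=dismax` is an assumption about the Solr version in use, since older releases exposed dismax as a separate request handler instead.

```python
from urllib.parse import urlencode, parse_qs


def dismax_params(user_query, field_boosts):
    """Build dismax request parameters from a mapping of field -> boost."""
    qf = " ".join(f"{field}^{boost:g}" for field, boost in field_boosts.items())
    return {
        "defType": "dismax",  # assumption: version-dependent way to select dismax
        "q": user_query,
        "qf": qf,
    }


# Fields "A" and "B" follow the naming in the message above; the boost of 3 is illustrative.
params = dismax_params("lucene boosting", {"A": 1.0, "B": 3.0})
query_string = urlencode(params)
print(query_string)
```

Because the user's query stays a plain string, the per-field weighting can be tuned in one place without touching the index.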
Re: Question About Boosting.
On 3/11/07, shai deljo <[EMAIL PROTECTED]> wrote:
> Thanks,
> The only way i found to do this
> (http://www.mail-archive.com/solr-user@lucene.apache.org/msg02456.html)
> is to hack and repeat the word several times in the field, but doesn't
> this screw up the norms?

Yes, it can influence the norms.

> Also, how do i boost words in a query? e.g. q=key1 key2 and i know
> key2 is twice as important as key1 ? (searching 1 field).

q=key1 key2^2

If the keywords that have more importance are the same for every document, query-time boosting is by far the preferable route. You have much more flexibility and it isn't less performant.

There are some things which are elegantly solved using index-time boosting, and so it is likely that lucene will support it one day.

-Mike
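The two points above can be sketched in a few lines. The first helper builds a query string in the `term^boost` syntax shown in the reply; it does no escaping, so it assumes plain single-word terms. The second illustrates why repeating a word affects the norms, assuming the classic Lucene length normalization of 1/sqrt(number of terms in the field); the field sizes are made up.

```python
import math


def boosted_query(term_boosts):
    """Join (term, boost) pairs into a query string, e.g. 'key1 key2^2'."""
    parts = []
    for term, boost in term_boosts:
        # A boost of 1 is the default, so it is left implicit.
        parts.append(term if boost == 1 else f"{term}^{boost:g}")
    return " ".join(parts)


def length_norm(num_terms):
    """Classic length normalization (assumed): shorter fields score higher per match."""
    return 1.0 / math.sqrt(num_terms)


print(boosted_query([("key1", 1), ("key2", 2)]))

# Repeating one word two extra times in a 10-term field makes it a 12-term field,
# which lowers the norm applied to every term in that field, not just the repeated one.
print(length_norm(10), length_norm(12))
```

The query-time version changes nothing in the index, so the boost can be adjusted per request.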
Re: Question About Boosting.
Thanks,
The only way i found to do this (http://www.mail-archive.com/solr-user@lucene.apache.org/msg02456.html) is to hack and repeat the word several times in the field, but doesn't this screw up the norms?

Also, how do i boost words in a query? e.g. q=key1 key2 and i know key2 is twice as important as key1 ? (searching 1 field).

Thanks,
S.

On 3/11/07, Walter Underwood <[EMAIL PROTECTED]> wrote:
> Back up another step. What are the documents and what do you
> want to show to the users? Have you tried the default configuration
> with real user queries?
>
> After you've tested it with user queries, then look at the results
> where the ranking isn't performing well.
>
> Lucene and Solr already automatically boost rare terms over
> common terms, using tf.idf weighting. I posted more detail
> on this in my blog last summer:
>
> http://wunderwood.org/most_casual_observer/2006/06/good_to_great_search.html
>
> wunder
>
> On 3/10/07 8:04 PM, "shai deljo" <[EMAIL PROTECTED]> wrote:
>
>> I have elements within a field that have different importance.
>> I thought boosting would be an elegant way to take this into account.
>> Please advise,
>>
>> On 3/10/07, Walter Underwood <[EMAIL PROTECTED]> wrote:
>>> What are you trying to achieve? Let's start with the problem
>>> instead of picking one solution which Solr doesn't support. --wunder
>>>
>>> On 3/10/07 5:08 PM, "shai deljo" <[EMAIL PROTECTED]> wrote:
>>>
>>>> How can i boost some tokens over others in the same field (at Index
>>>> time) ? If this is not supported directly, what's the best way around
>>>> this problem (what's the hack to solve this :) ).
>>>> Thanks,
>>>> Shai
Re: Question About Boosting.
Back up another step. What are the documents and what do you want to show to the users? Have you tried the default configuration with real user queries?

After you've tested it with user queries, then look at the results where the ranking isn't performing well.

Lucene and Solr already automatically boost rare terms over common terms, using tf.idf weighting. I posted more detail on this in my blog last summer:

http://wunderwood.org/most_casual_observer/2006/06/good_to_great_search.html

wunder

On 3/10/07 8:04 PM, "shai deljo" <[EMAIL PROTECTED]> wrote:

> I have elements within a field that have different importance.
> I thought boosting would be an elegant way to take this into account.
> Please advise,
>
> On 3/10/07, Walter Underwood <[EMAIL PROTECTED]> wrote:
>> What are you trying to achieve? Let's start with the problem
>> instead of picking one solution which Solr doesn't support. --wunder
>>
>> On 3/10/07 5:08 PM, "shai deljo" <[EMAIL PROTECTED]> wrote:
>>
>>> How can i boost some tokens over others in the same field (at Index
>>> time) ? If this is not supported directly, what's the best way around
>>> this problem (what's the hack to solve this :) ).
>>> Thanks,
>>> Shai
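To make the tf.idf point concrete, here is a small sketch of how rare terms end up weighted above common ones. It assumes the classic Lucene-style formulas, tf = sqrt(term frequency) and idf = 1 + ln(numDocs / (docFreq + 1)); the document counts are invented, and real scoring combines these with norms and other factors not shown here.

```python
import math


def tf(freq):
    """Term-frequency factor (assumed classic form): grows slowly with repetitions."""
    return math.sqrt(freq)


def idf(doc_freq, num_docs):
    """Inverse document frequency (assumed classic form): rarer terms score higher."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


num_docs = 1000  # hypothetical index size
rare = idf(doc_freq=5, num_docs=num_docs)      # term found in 5 documents
common = idf(doc_freq=500, num_docs=num_docs)  # term found in 500 documents

# A single occurrence of the rare term outweighs a single occurrence of the common one.
print(tf(1) * rare, tf(1) * common)
```

This weighting is applied without any explicit boosts, which is why testing the default ranking first is a reasonable starting point.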
Re: Question About Boosting.
I have elements within a field that have different importance.
I thought boosting would be an elegant way to take this into account.
Please advise,

On 3/10/07, Walter Underwood <[EMAIL PROTECTED]> wrote:
> What are you trying to achieve? Let's start with the problem
> instead of picking one solution which Solr doesn't support. --wunder
>
> On 3/10/07 5:08 PM, "shai deljo" <[EMAIL PROTECTED]> wrote:
>
>> How can i boost some tokens over others in the same field (at Index
>> time) ? If this is not supported directly, what's the best way around
>> this problem (what's the hack to solve this :) ).
>> Thanks,
>> Shai
Re: Question About Boosting.
What are you trying to achieve? Let's start with the problem instead of picking one solution which Solr doesn't support. --wunder

On 3/10/07 5:08 PM, "shai deljo" <[EMAIL PROTECTED]> wrote:

> How can i boost some tokens over others in the same field (at Index
> time) ? If this is not supported directly, what's the best way around
> this problem (what's the hack to solve this :) ).
> Thanks,
> Shai