Thank you both, Umesh Prasad and Anshum.
You've been a great help.


-----Original Message-----
From: Anshum [mailto:[email protected]] 
Sent: Sunday, January 16, 2011 5:26 PM
To: [email protected]
Subject: Re: Result ordering

Hi Pelit,
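Firstly, the number of query words that match in a document is not term 
frequency. Term frequency is how often a single term occurs in a document; 
the number of query terms matched is what Lucene's scoring calls the 
coordination factor (coord). You can read more about the terminology used 
in search at 
http://www.miislita.com/term-vector/term-vector-3.html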
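
To make the distinction concrete, here is a toy sketch in plain Java. It is
not Lucene code: the class and method names are made up for illustration, and
it assumes naive lowercase/whitespace tokenization instead of a real analyzer.
The coord-style factor follows the idea used by Lucene's default similarity,
i.e. matched query terms divided by total query terms:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Toy illustration only (hypothetical names); not part of Lucene. */
final class CoordVsTf {

    /** Lowercase and split on whitespace, a stand-in for a real analyzer. */
    static List<String> tokens(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    /** Term frequency: how many times ONE term occurs in the document. */
    static int termFrequency(String term, String doc) {
        int count = 0;
        for (String t : tokens(doc)) {
            if (t.equals(term)) {
                count++;
            }
        }
        return count;
    }

    /** Coord-style factor: matched query terms / total query terms. */
    static double coord(String query, String doc) {
        List<String> docTokens = tokens(doc);
        List<String> queryTerms = tokens(query);
        int overlap = 0;
        for (String q : queryTerms) {
            if (docTokens.contains(q)) {
                overlap++;
            }
        }
        return (double) overlap / queryTerms.size();
    }

    public static void main(String[] args) {
        System.out.println(coord("lord rings", "LORD of the RINGS"));
        System.out.println(coord("lord rings", "selected jewels and RINGS"));
        System.out.println(termFrequency("rings", "rings and more rings"));
    }
}
```

With the query "lord rings", "LORD of the RINGS" matches both terms and
"selected jewels and RINGS" matches only one, so the first document gets the
larger coord-style factor regardless of how often each term repeats.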

Looking at what you're trying to achieve, here are a few possible solutions.
Below, I am assuming your query is for the terms "A B":
1. Look at phrase queries with a high slop value, so that matches like "A B" 
are weighted more than "A .... B" (separated by a few positions).
2. You could also write a custom similarity (advanced), see 
http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/SimilarityDelegator.html
and use the coord value to reward documents that match more of the query 
terms.
3. For wildcard searches, a brute-force approach is an OR query for 
RING OR RING* with a boost on the former part (the wildcard is expanded to 
an OR query in the end anyway).
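
As a rough sketch of options 1 and 3, the helpers below only build query
strings in Lucene's classic query parser syntax (a quoted phrase followed by
~slop, and term^boost). The class and method names are hypothetical, the input
is assumed to contain no characters that need escaping, and the slop and boost
values are placeholders you would have to tune yourself:

```java
/** Builds query strings only (hypothetical helper); parsing is left to the caller. */
final class QueryStrings {

    /** Sloppy phrase such as "lord rings"~10: closer matches score higher. */
    static String sloppyPhrase(String phrase, int slop) {
        if (slop < 0) {
            throw new IllegalArgumentException("slop must be >= 0");
        }
        // Assumption: the phrase contains no quotes or other special characters.
        return "\"" + phrase.trim() + "\"~" + slop;
    }

    /** Exact term boosted over its prefix expansion, such as ring^4 OR ring* */
    static String exactOverPrefix(String term, int boost) {
        if (boost < 1) {
            throw new IllegalArgumentException("boost must be >= 1");
        }
        String t = term.trim();
        return t + "^" + boost + " OR " + t + "*";
    }

    public static void main(String[] args) {
        System.out.println(sloppyPhrase("lord rings", 10));
        System.out.println(exactOverPrefix("ring", 4));
    }
}
```

Note that a phrase query only matches documents that contain all of its
terms, so to keep the OR behaviour you would probably add the sloppy phrase
as an extra clause next to the plain term query rather than replace it.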

Looking at your current query, it seems you'd benefit from a better 
understanding of Lucene, so getting a copy of "Lucene in Action, 2nd Ed." 
<http://www.manning.com/hatcher3/> would be a good idea for you and anyone 
in your position.
Hope that helps.

--
Anshum Gupta
http://ai-cafe.blogspot.com


On Sun, Jan 16, 2011 at 8:03 PM, Pelit Mamani
<[email protected]> wrote:

> Hi,
>
> I'm maintaining some Lucene-based code, and we're trying to get 
> control over result ordering (users aren't happy with the default).
> I know how to boost a Field or Document (very useful).
> But:
>
>
> 1)      Is there a way to boost "OR" queries, based on the number of
> matched terms?
> So the OR query "lord rings" will first show the document "LORD of the 
> RINGS" (which holds both words), and only later "selected jewels and RINGS"
> (which only holds one word).
> Is that what you call "Term Frequency"? And how do you boost it further?
> I did a bit of tinkering and got the impression Lucene would boost it 
> by default, but not enough - it's sometimes overridden by other 
> boosting factors (maybe the boost for short expressions).
>
>
> 2)      Is there a way to boost based on "positions"?
> So "LORD of the RINGS" gets precedence over "LORD of the funny golden 
> RINGS", because the search words are positioned closer to each other?
>
>
> 3)      With wildcard searches, is there a way to boost documents that hold
> an exact match?
> So if I search for "ring*", I first see the exact match "story of a 
> RING", and only later "a RINGING failure".
>
> Thanx a bunch.
>
>
>
>

 
 



