Proximity and Percentage match search in Lucene

Radha Sreedharan Sat, 25 Apr 2009 02:48:43 -0700

  What I need is  the following :
> If my document field is ( ab,bc,cd,ef) and Search tokens are (ab,bc,cd).
>
>  Given the following :
>  I should get a hit even if all of the search tokens aren't present
>  If the tokens are found they should be found within a distance x of
> each other ( proximity
> search)
>>
>> I need the percentage match of the search tokens with the document field.
>>
>> Currently this is my query :
>> 1) I form all possible permutation of the search tokens
>> 2) do a spanNearQuery of each permutation
>> 3)  Do a DisjunctionMaxQuery on the spannearqueries.
>>
>> This is how I compute % match  :
>> % match =  ( Score by running the query on the document field ) /
>>              ( score by running the query on a document field created out of 
>> search
>> tokens )
>>
>> The numerator gives me the actual score with the search tokens run on the
>> field.
>> Denominator gives  me the  best possible or maximum possible score with
>> the current search
> tokens
>>
>> For this example << If my document field is ( ab,bc,cd,ef) and Search
>> tokens are
> (ab,bc,cd).>> I expect a % match of around 90%.
>>
>> However I get a match of only around 50% without a boost. Using a boost
>> infact reduces
> my percentage.
>>
>> I even overrode the queryNorm method to return a one, still the
>> percentage did not increase.
> *
> Is there any way of implementing this using the current set of
> implementation classes in Lucene and not making complex changes to the
> structure by itself.
> ( which is what i gather has to be done from the previous replies)
>
> Can anyone suggest an alternative way of implementing this requirement
> using the existing bunch of classes in Lucene and not necessarily
> using the ones I have used*


Regards,
Radha

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Proximity and Percentage match search in Lucene

Reply via email to