Hi Preetam,

Questions like yours are better served in the java-user mailing list, which is 
devoted to Q&A about *using* Lucene, rather than here in the java-dev list, 
which is reserved for discussions concerning Lucene's *development*.

In the contrib/ area on the trunk (not yet part of any release), there is a 
class called ShingleFilter that might be useful to you - here is a link to the 
nightly javadoc for it:

<http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc/contrib-analyzers/org/apache/lucene/analysis/shingle/ShingleFilter.html>

If you create a field containing token n-grams (a.k.a. shingles) and use it as 
a component of your queries, I think you can achieve something similar to what 
you want.
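
To make the idea concrete, here is a minimal plain-Java sketch of what word shingles are and how shared shingles separate your two example documents. It does not use Lucene or ShingleFilter; the class name ShingleSketch and its two methods are made up for illustration, and it assumes simple whitespace tokenization with lowercasing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: not Lucene's ShingleFilter.
public class ShingleSketch {

    // Returns every run of `size` consecutive words, in order of appearance.
    public static List<String> shingles(String text, int size) {
        String[] words = text.trim().toLowerCase().split("\\s+");
        List<String> out = new ArrayList<>();
        for (int i = 0; i + size <= words.length; i++) {
            out.add(String.join(" ", Arrays.copyOfRange(words, i, i + size)));
        }
        return out;
    }

    // Counts the distinct query shingles that also occur in the document.
    public static int sharedShingles(String doc, String query, int size) {
        Set<String> docShingles = new HashSet<>(shingles(doc, size));
        Set<String> queryShingles = new HashSet<>(shingles(query, size));
        queryShingles.retainAll(docShingles);
        return queryShingles.size();
    }

    public static void main(String[] args) {
        String query = "Brooklyn homes with 3 bed rooms and swimming pools";
        String docA = "new york existing homes 3 bed 2 bath homes 3 miles from city center 2 rooms";
        String docB = "new york 2 beds 3 baths";

        // docA shares the two-word shingle "3 bed" with the query; docB shares none.
        System.out.println(sharedShingles(docA, query, 2)); // prints 1
        System.out.println(sharedShingles(docB, query, 2)); // prints 0
    }
}
```

In an index, the shingles would be tokens in their own field, so a query clause on that field would add score only to documents that contain the same adjacent word pairs as the user's query.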

Steve

On 07/14/2008 at 9:07 AM, Preetam Rao wrote:
> Hi,
> 
> Is there a query in Lucene which matches sub phrases ?
> 
> For example if the document text is "new york  existing homes 3 bed 2
> bath homes 3 miles from city center 2 rooms" and if user enters
> "Brooklyn homes with 3 bed rooms  and swimming pools", I would like to
> recognize the fact that the document contained a sub prefix of the user
> query and give it more score compared to a document which contained all
> the terms, but not in the correct order, for example, "new york 2 beds 3 baths".
> 
> This kind of query will be useful when we do not interpret or parse the
> user query. As seen in the example, it will prove useful when numbers
> are involved since numbers usually make sense with the term immediately
> following it.
> 
> This is something of a middle ground between a pure 'boolean OR' query and
> an 'exact phrase query' as far as directly using the user query is
> concerned.
> 
> I have documented my thoughts in the document below, and if nothing
> similar is already implemented, I will open a JIRA issue and work on it.
> 
> http://docs.google.com/Doc?id=dgzj3nsp_0z2j48hc6
> 
> Please let me know your thoughts on alternate solutions and approaches
> since it will be very useful for my current project.
> 
> Thanks
> Preetam
