Re: Pre-filtering for expensive query

Matt Ronge Wed, 03 Sep 2008 09:07:30 -0700


On Aug 30, 2008, at 3:01 PM, Paul Elschot wrote:

Op Saturday 30 August 2008 18:19:09 schreef Matt Ronge:

On Aug 30, 2008, at 4:43 AM, Karl Wettin wrote:

Can you tell us a bit more about what you custom query does?
Perhaps you can build the "candidate filter" and reuse it over and
over again?


I cannot reuse it. The candidate filter would be constructed by first
running a boolean query with a number of SHOULD clauses. So then I
know which docs at least contain the terms I'm looking for. Once I have
this set, I will look at the ordering of the matches (it's a bit more
sophisticated than just a phrase query) and find the final matches.
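
The two-step approach described above (a cheap candidate test, then a costlier ordering check on the survivors) could be sketched roughly as follows. This is only an illustration in plain Java, not Lucene code: the class TwoPhaseSketch and its methods are hypothetical names, documents are plain token lists, and the "ordering" check is a simple in-order subsequence test standing in for the real, more sophisticated matching.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: cheap pre-filter first, expensive check second.
final class TwoPhaseSketch {

    // Phase 1 (cheap): like a boolean query of SHOULD clauses,
    // true if the document contains at least one query term.
    static boolean isCandidate(List<String> docTokens, Set<String> queryTerms) {
        for (String token : docTokens) {
            if (queryTerms.contains(token)) {
                return true;
            }
        }
        return false;
    }

    // Phase 2 (stand-in for the expensive part): true if all terms
    // occur in the document in the given order, gaps allowed.
    static boolean inOrder(List<String> docTokens, List<String> orderedTerms) {
        int next = 0;
        for (String token : docTokens) {
            if (next < orderedTerms.size() && token.equals(orderedTerms.get(next))) {
                next++;
            }
        }
        return next == orderedTerms.size();
    }

    // Runs phase 2 only on documents that pass phase 1;
    // returns the indices of matching documents.
    static List<Integer> search(List<List<String>> docs, List<String> orderedTerms) {
        Set<String> terms = new HashSet<>(orderedTerms);
        List<Integer> hits = new ArrayList<>();
        for (int id = 0; id < docs.size(); id++) {
            List<String> doc = docs.get(id);
            if (isCandidate(doc, terms) && inOrder(doc, orderedTerms)) {
                hits.add(id);
            }
        }
        return hits;
    }
}
```

With the documents [hello big world], [world hello] and [foo bar] and the ordered terms [hello, world], only document 0 is returned: document 2 never reaches the ordering check, and document 1 fails it.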


Sounds like you may want to take a look at SpanNearQuery.

I'm going to take a second look at SpanNearQuery. I need it to support
optional tokens, so I'm guessing I'll need to create a subclass to do
that.

Since my boolean clauses are different for each query I can't reuse
the filter.


With (a variation of) SpanNearQuery you may end up not needing
any filtering at all, because it already uses skipTo() where possible.
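
To illustrate why skipTo() can make a separate pre-filter unnecessary, here is a minimal sketch of intersecting two sorted doc id lists by letting each side skip ahead to the other's current doc. It is not Lucene's implementation: SkipToSketch and Cursor are hypothetical names, doc ids are assumed to be non-negative and sorted, and the skip is a linear scan where a real index would use skip data.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of skipTo-driven intersection of two postings lists.
final class SkipToSketch {

    // Minimal cursor over sorted doc ids, standing in for a postings iterator.
    static final class Cursor {
        private final int[] docs;
        private int pos = 0;

        Cursor(int[] sortedDocIds) {
            this.docs = sortedDocIds;
        }

        // Moves to the first doc id >= target; false once the list is exhausted.
        boolean skipTo(int target) {
            while (pos < docs.length && docs[pos] < target) {
                pos++;
            }
            return pos < docs.length;
        }

        int doc() {
            return docs[pos];
        }
    }

    // Returns doc ids present in both lists. Docs that cannot match
    // are skipped over rather than examined one by one on both sides.
    static List<Integer> intersect(int[] a, int[] b) {
        List<Integer> out = new ArrayList<>();
        Cursor ca = new Cursor(a);
        Cursor cb = new Cursor(b);
        int target = 0;
        while (ca.skipTo(target) && cb.skipTo(ca.doc())) {
            if (ca.doc() == cb.doc()) {
                out.add(ca.doc());
                target = ca.doc() + 1; // move past the match
            } else {
                target = cb.doc();     // b is ahead, let a catch up
            }
        }
        return out;
    }
}
```

For the lists [1, 4, 7, 9] and [4, 5, 9, 12] this yields [4, 9]. The rarer term effectively acts as the filter for the more common one.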

In case you are looking for documents that contain partial phrases
from an input query that has more than 2 words, have a look at Nutch.

I poked around in the Nutch docs and Javadocs. What should I look at
in Nutch? What does it do exactly? Is it the trick that Doug Cutting
mentioned, where you concatenate neighboring terms so that "Hello
world" becomes the token hello.world?
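
For reference, the term-concatenation trick as I understand it could be sketched like this. Whether Nutch does exactly this is the open question above, so treat the code purely as an illustration: PairTokenSketch and pairTokens are hypothetical names, and splitting on whitespace plus lowercasing is an assumed, simplified analysis step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: emit one extra token per pair of neighboring words.
final class PairTokenSketch {

    // "Hello world" -> [hello.world]
    static List<String> pairTokens(String text) {
        String[] words = text.trim().toLowerCase(Locale.ROOT).split("\\s+");
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i + 1 < words.length; i++) {
            tokens.add(words[i] + "." + words[i + 1]);
        }
        return tokens;
    }
}
```

Indexing such pair tokens would let a two-word phrase be looked up as a single term, at the cost of a larger index.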


Thanks for everyone's help so far, I really appreciate it,
--
Matt

