Matt Quail wrote:

We have a system where I'll be given 10 or 20 unique keys.


I assume you mean you have one unique-key field, and you are given 10-20 values to find for this one field?


Internally I'm creating a new Term and then calling IndexReader.termDocs() on this term. Then if termdocs.next() matches then I'll return this document.


Are you calling reader.termDocs() inside a (tight) loop? It might be better to create one TermEnum, and use "seek". Something like this:

Yes.. this is another approach I was thinking of taking. I was thinking of building up a list of indexes which had a high probability of holding the given document and then searching for each of them.

What I'm worried about though is that it would be a bit slower... I'm just going to have to test out different implementations to see....

<snip>



I'm pretty sure that will work. And if you can avoid the multi- threading issues, you might try and use the same TermDocs object for as long as possible (that is, move it up out of as many tight loops as you can).

Well... that doesn't look like the biggest overhead. The bottleneck seens to be in seek() and the fact that its using an InputStream to read bytes off disk. I actually tried to speed that up by crainking InputSteam.BUFFER_SIZE var higher but that didn't work either though I'm not sure if its a caching issue. I sent an email to the list about this earlier but no one responded.

So it seems like my bottleneck is in seek() so It would make sense to figure out how to limit this.

Is this O(log(N))  btw or is it O(N) ?

Kevin

--


Use Rojo (RSS/Atom aggregator)! - visit http://rojo.com. See irc.freenode.net #rojo if you want to chat.

Rojo is Hiring! - http://www.rojonetworks.com/JobsAtRojo.html

  Kevin A. Burton, Location - San Francisco, CA
     AIM/YIM - sfburtonator,  Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to