Re: MultiSearcher vs MultiReader

Grant Ingersoll Sat, 21 Apr 2007 16:23:52 -0700


On Apr 20, 2007, at 3:08 PM, Kirk Roberts wrote:

Grant Ingersoll wrote:
I will try to take a crack at these, but I am not sure I know exactly what you are looking for, so maybe others can chime in too. At any rate, MultiSearcher has been around a lot longer (2001 versus 2004, or at least that is what the changelog seems to indicate) and it works over Searchables, including RemoteSearchable, so you could use it to combine results from remote searches as well. MultiReader can only work over IndexReaders, and I am not aware of any way that it can do remote index reading, so there are different viable use cases for the two.
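To make "combine results" a bit more concrete, here is a rough sketch of the general idea: shift each sub-index's doc ids by the sizes of the indexes before it, then merge the hits by score. This is not MultiSearcher's actual code. Hit and HitMerger are hypothetical names invented for illustration, and the sketch ignores details a real implementation has to handle, such as making scores comparable across indexes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical value type for illustration: a doc id plus its score. */
final class Hit {
    final int doc;
    final float score;

    Hit(int doc, float score) {
        this.doc = doc;
        this.score = score;
    }
}

/** Hypothetical helper, not a Lucene class. */
final class HitMerger {
    /**
     * perIndexHits.get(i) holds hits whose doc ids are local to index i;
     * maxDocs[i] is the size of index i. Returns the topN hits overall,
     * with doc ids rebased into one combined id space.
     */
    static List<Hit> merge(List<List<Hit>> perIndexHits, int[] maxDocs, int topN) {
        List<Hit> all = new ArrayList<>();
        int base = 0; // first combined doc id owned by the current index
        for (int i = 0; i < perIndexHits.size(); i++) {
            for (Hit h : perIndexHits.get(i)) {
                all.add(new Hit(base + h.doc, h.score));
            }
            base += maxDocs[i];
        }
        // Highest score first; ties broken by lower doc id.
        all.sort((a, b) -> {
            int byScore = Float.compare(b.score, a.score);
            return byScore != 0 ? byScore : Integer.compare(a.doc, b.doc);
        });
        return new ArrayList<>(all.subList(0, Math.min(topN, all.size())));
    }
}
```

With two indexes of sizes 10 and 5, a hit on local doc 0 in the second index comes back as combined doc 10.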
Really I just want to know the fastest mechanism for searching. Since I don't use a RemoteSearchable, it sounds like using an IndexSearcher on a MultiReader is the way to go.
2. Why doesn't the MultiReader implement the rather nice methods that the MultiSearcher has (I'm thinking specifically of subSearcher(int) and subDoc(int))?
I suppose subDoc might make sense, but subSearcher does not for a Reader. Perhaps the private readerIndex() method on MultiReader is something you are interested in? Is that getting at what you want? Maybe you can submit a patch that makes readerIndex public if that is what you are interested in?
Obviously the methods would have to be appropriately named :). It sounds like some development work will have to be done on this then. I have no problem doing it myself and submitting a patch (I can pick up this discussion on the developer list when I have time), but for now is it safe to assume that if I have the number of documents per IndexReader and the order of the readers, I can calculate the "real" IndexReader and the "real" docid for that sub-IndexReader? I realize I might not be very clear, so let's see if I can restate my example more clearly in pseudo-code (apologies in advance):

Doesn't MultiReader do this already in the readerIndex() method? It has to figure out which IndexReader the document is in, in order to retrieve it in the first place. This is done in readerIndex(int). Unless I am still not understanding what you mean :-). Your code below looks a lot like what is in readerIndex() though, right?


// r1 holds 100 documents, r2 holds 50, r3 holds 75
IndexReader[] readers = new IndexReader[] { r1, r2, r3 };
MultiReader mr = new MultiReader(readers);

// get docid in MultiReader
int docid = magicFindDocumentFunc(mr);

for (IndexReader r : readers) {
  // use maxDoc(), not numDocs(): doc ids still count deleted documents
  if (docid >= r.maxDoc()) {
    // not in this reader; shift docid into the next reader's range
    docid -= r.maxDoc();
  }
  else {
    // r is the IndexReader that holds the desired Document;
    // docid's current value is the Lucene id of that Document within r
    break;
  }
}

I know I can get the Document straight from the MultiReader, but in my case I need to know which exact IndexReader object the Document is really coming from.
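For what it is worth, the lookup being discussed can be sketched without any Lucene classes: keep the starting offset of each sub-reader and binary-search for the one that owns a given doc id. SubReaderLocator is a hypothetical name, not a Lucene class, and the sizes passed in are assumed to be each reader's maxDoc() (not numDocs(), since deleted documents still occupy doc ids). Take it as an illustration of the idea rather than a copy of what readerIndex() does internally.

```java
/** Hypothetical helper, not part of Lucene: maps a combined doc id to a sub-reader. */
final class SubReaderLocator {
    // starts[i] is the first combined doc id owned by sub-reader i;
    // the last entry is the total size across all sub-readers.
    private final int[] starts;

    SubReaderLocator(int[] maxDocs) {
        starts = new int[maxDocs.length + 1];
        for (int i = 0; i < maxDocs.length; i++) {
            starts[i + 1] = starts[i] + maxDocs[i];
        }
    }

    /** Index of the sub-reader that owns the given combined doc id. */
    int readerIndex(int docId) {
        if (docId < 0 || docId >= starts[starts.length - 1]) {
            throw new IllegalArgumentException("doc id out of range: " + docId);
        }
        // Binary search for the largest i with starts[i] <= docId.
        int lo = 0;
        int hi = starts.length - 2;
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1;
            if (starts[mid] <= docId) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }

    /** Doc id relative to the sub-reader that owns it. */
    int localDoc(int docId) {
        return docId - starts[readerIndex(docId)];
    }
}
```

With the sizes from the example (100, 50, 75), combined doc id 120 falls in the second reader (index 1) at local id 20. Given the original readers array, readers[locator.readerIndex(docid)] would then be the exact IndexReader object.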


Thanks in advance for any help,
Kirk

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


--------------------------
Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org

Read the Lucene Java FAQ at http://wiki.apache.org/jakarta-lucene/LuceneFAQ



