On Apr 20, 2007, at 3:08 PM, Kirk Roberts wrote:
Grant Ingersoll wrote:
I will try to take a crack at these, but not sure I know exactly
what you are looking for, so maybe others can chime in too.
At any rate, MultiSearcher has been around a lot longer (2001
versus 2004, or at least that is what the changelog seems to
indicate) and it works over Searchables, including
RemoteSearchable. So you could use it to combine results from
remote searches as well, MultiReader can only work over
IndexReaders and I am not aware of any way that it can do remote
index reading, so there are different viable use cases for the two.
Really I just want to know the fastest mechanism for searching.
Since I don't use a RemoteSearcher, it sounds like using an
IndexSearcher on a MultiReader is the way to go.
2. Why doesn't the MultiReader implement the rather nice methods
that the MultiSearcher has (I'm thinking specifically of
subSearcher(int) and subDoc(int))?
I suppose subDoc might make sense, but subSearcher does not for a
Reader. Perhaps the private readerIndex() method on MultiReader
is something you are interested in? Is that getting at what you
want? Maybe you can submit a patch that makes readerIndex public
if that is what you are interested in?
Obviously the methods would have to be appropriately named :). It
sounds like some development work will have to be done on this
then. I have no problem doing it myself and submitting a patch (I
can pick up this discussion on the developer list when I have
time), but for now is it safe to assume that if I have the number
of documents per IndexReader and the order of the readers that I
can calculate the "real" IndexReader and the "real" docid for that
sub-IndexReader? I realize I might not be very clear, so lets see
if I can re-state my example more clearly in psedo-code (apologize
in advance):
Doesn't MultiReader do this already in the readerIndex() method? It
has to figure out which IndexReader the document is in in order to
retrieve it in the first place. This is done in readerIndex(int).
Unless I still am not understanding what you mean :-). Your code
below looks a lot like what is in readerIndex() though, right?
IndexReader r1 (size = 100 documents)
IndexReader r2 (size = 50 documents)
IndexReader r3 (size = 75 documents)
IndexReader[] readers = new IndexReader[] { r1, r2, r3 }
MultiReader mr = new MultiReader(readers)
// get docid in MultiReader
int docid = magicFindDocumentFunc(mr);
for (IndexReader r : readers) {
if (docid > r.numDocs()) {
docid -= r.numDocs()
}
else {
// r is the IndexReader that the desired Document
// docid's current value is lucene id of that Document within r
}
}
I know I can get the Document straight from the MultiReader but in
my case I need to know which exact IndexReader object the Document
is really coming from.
Thanks in advance for any help,
Kirk
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
--------------------------
Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org
Read the Lucene Java FAQ at http://wiki.apache.org/jakarta-lucene/
LuceneFAQ
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]