Thanks for the replies. Here is why I need the subreader (or subsearcher in earlier Lucene versions):
I have multiple collections of documents, say broken out by years (it's more complex than this, but this illustrates the use case):

Collection1 >>> D:/some folder/2009/*.pdf (lots of PDF files)
Collection2 >>> D:/another folder/2010/*.pdf (lots of different PDF files)

And so forth. So in the example above, I would have two indexes, one for each year.

When I index, I store the *relative* path of each document as a field. For example, 'link:2009/file1.pdf' or 'link:2010/file1.pdf', etc. I do not store the full path to the files in the index. This has a huge advantage because we can move the documents to another file system, server, or path without rebuilding the index.

I store the required base path to the documents of each collection in a database, external to the collection. In the example above, Collection1 would have a base path of "D:/some folder/". Therefore, to actually access a document referenced in a collection, you concatenate the base_path retrieved from the database with the "link" field retrieved from the collection. I would think this is a very common approach.

When searching a single collection, no problem. But if I want to search the two collections at the same time, I need to know which collection the hit came from so I can retrieve the right base_path from the database. These base_paths can be different.

As mentioned, this was trivial in Lucene 1.x and 2.x, as I just grabbed the subsearcher from the result, which would for example return a 1 or 2 indicating which of the two collections the result came from. Then I could build the path to the file. In other words, subsearcher gave me the foreign key I needed to map to additional external information associated with each index during a multisearch. That is now gone in Lucene 3.3.

I guess a really simple solution is just to store a new field with each document that uniquely identifies its collection.
So in the example above, I could create a new field "foreign_key_index" for each document, which would be "Collection1" or "Collection2" respectively. This would surely work, but it would break backwards compatibility of my system and would require me to rebuild every collection. It also seems like a lot of work for something so simple.

If there is another way to do this, please advise.

Thanks in advance and much appreciated.

- JMA

-----Original Message-----
From: Uwe Schindler [mailto:u...@thetaphi.de]
Sent: Monday, August 29, 2011 8:05 PM
To: java-user@lucene.apache.org
Subject: RE: No subsearcher in Lucene 3.3?

Why do you need to know the subreader? If you want to get the document's stored fields, use the MultiReader. If you really want to know the subreader, use this:

http://lucene.apache.org/java/3_3_0/api/core/org/apache/lucene/util/ReaderUtil.html#subReader(int, org.apache.lucene.index.IndexReader)

But this is "somewhat slow", so don’t use it in inner loops.

Devon suggested:
> If I'm understanding your question correctly, in the Collector, you are told
> which IndexReader you are working with when the setNextReader method is
> called. Hopefully that helps.

This does not work as expected, because the Collector gets the lowest-level readers, which are in fact sub-sub-readers (each single IndexReader itself consists of several "SegmentReaders", unless you have optimized sub-indexes).

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Joseph MarkAnthony [mailto:mrj...@comcast.net]
> Sent: Monday, August 29, 2011 8:54 PM
> To: java-user@lucene.apache.org
> Subject: No subsearcher in Lucene 3.3?
>
> Greetings,
> In the past (Lucene version 2.x) I successfully used
> MultiSearcher.subsearcher() to identify the searchable within a
> MultiSearcher to which a hit belonged.
>
> In moving to Lucene 3.3, MultiSearcher is now deprecated, and I am
> trying to create a standard IndexSearcher over a MultiReader. I
> haven't gotten this to work yet but it appears to be the correct
> approach. However, I cannot find any corresponding "subsearcher"
> method that could identify which subreader is the one that finds the hit.
>
> For example, it used to be straightforward:
>
> Create a MultiSearcher over several Searchables, and call
> MultiSearcher.subsearcher to get the searchable that holds each search hit.
>
> Now, I am creating an IndexSearcher over a MultiReader, which is created over
> an array of IndexReaders. So when I get a hit, what's the best way to
> determine which of the several subReaders the hit came from?
>
> Thanks in advance,
> JMA
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
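
For reference, the lookup discussed in this thread (global hit doc ID to sub-index number to base path) can be sketched without rebuilding any index. A MultiReader numbers documents by concatenating its sub-readers in the order they were passed in, so the sub-index can be recovered from the doc ID and each sub-reader's maxDoc(). The sketch below is plain Java and does not call Lucene at all. The class name SubReaderLookup, both method names, the maxDoc values, and the second base path are assumptions made up for illustration. It also assumes the IndexReaders are handed to the MultiReader in the same order as the collections are listed in the database.

```java
import java.util.Arrays;

// Standalone sketch: no Lucene classes are used, and all numbers are made up.
final class SubReaderLookup {

    // starts[i] is the first global doc ID belonging to sub-reader i.
    // The final entry is the total number of documents.
    static int[] docStarts(int[] maxDocs) {
        int[] starts = new int[maxDocs.length + 1];
        for (int i = 0; i < maxDocs.length; i++) {
            starts[i + 1] = starts[i] + maxDocs[i];
        }
        return starts;
    }

    // Returns the index of the sub-reader that holds the given global doc ID:
    // the largest i such that starts[i] <= docId (binary search).
    static int subIndex(int docId, int[] starts) {
        int total = starts[starts.length - 1];
        if (docId < 0 || docId >= total) {
            throw new IllegalArgumentException("doc ID out of range: " + docId);
        }
        int lo = 0;
        int hi = starts.length - 2; // index of the last sub-reader
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1;
            if (starts[mid] <= docId) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        // Hypothetical maxDoc() values of the two per-year indexes.
        int[] maxDocs = {1200, 950};
        // Base paths as they would come from the external database (assumed values).
        String[] basePaths = {"D:/some folder/", "D:/another folder/"};

        int[] starts = docStarts(maxDocs);
        System.out.println("docStarts = " + Arrays.toString(starts));

        int hitDocId = 1200;            // global doc ID of a search hit
        String link = "2010/file1.pdf"; // stored "link" field of that hit
        int collection = subIndex(hitDocId, starts);
        System.out.println("collection index = " + collection);
        System.out.println("full path = " + basePaths[collection] + link);
    }
}
```

In a real setup the maxDocs array would be filled from the maxDoc() of each top-level IndexReader given to the MultiReader, and docStarts would be computed once per search rather than once per hit. Because the lookup works on the top-level readers rather than on segments, it avoids the sub-sub-reader problem Uwe describes for Collector.setNextReader.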