Thanks, I will give this a try; it seems like it should work for my case.

-----Original Message-----
From: Uwe Schindler [mailto:u...@thetaphi.de]
Sent: Tuesday, August 30, 2011 3:53 PM
To: java-user@lucene.apache.org
Subject: RE: No subsearcher in Lucene 3.3?
Hi,

Use ReaderUtil from the o.a.l.util package, which does the recursive traversal of the reader tree. It has methods to solve this problem. You can cache the int[] start array that contains the starting document ids for each subreader. This makes it possible to use a standard TopDocs-based search, without Collectors (which should not be required for your case), and still remap the document ids.

For this issue you are not interested in stepping recursively into the reader tree down to the lowest level (non-optimized subindexes will also expand to multiple readers), so the only thing you need to know is which direct subreader of the MultiReader a hit belongs to.

For a quick lookup, one approach might be to iterate *once* before searching over the direct subreaders of the MultiReader (without recursion) and sum up the maxDoc() (not numDocs()!) return values. For each subreader (starting with 0), put the running sum into a TreeMap (!!!) together with the target index name or whatever you need to identify the subreader. You can then look up the docid from the TopDocs object using TreeMap.floorEntry(docId).getValue() (requires Java 6).

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Devon H. O'Dell [mailto:devon.od...@gmail.com]
> Sent: Tuesday, August 30, 2011 8:04 PM
> To: java-user@lucene.apache.org
> Subject: Re: No subsearcher in Lucene 3.3?
>
> 2011/8/30 Joe MA <mrj...@comcast.net>:
> > When searching a single collection, no problem. But if I want to search the
> > two collections at the same time, I need to know which collection the hit
> > came from so I can retrieve the base_path from the database. These
> > base_paths can be different. As mentioned, this was trivial in Lucene 1.x
> > and 2.x as I just grabbed the subsearcher from the result, which would for
> > example return a 1 or 2 indicating which of the two collections the result
> > came from. Then I can build the path to the file.
> > In other words, subsearcher gave me the foreign key I needed to map to
> > additional external information associated with each index during a
> > multisearch. That is now gone in Lucene 3.3.
>
> You could use the suggestion I made of doing the loop over the IndexReader
> subReaders (recursively until you get to the SegmentReaders) and use a
> HashMap<SegmentReader, String> (or similar container structure) to associate
> the segments to a path. It sounds like your application doesn't reopen
> indexes with much frequency, which is good: you will need to regenerate this
> map any time you reopen your indexes.
>
> When collector.setNextReader is called, you can simply get (at that point)
> the String associated with the particular SegmentReader you're working with.
> Then, every time Collector.collect is called, you can tack that on to
> whatever data structure you're using to get at your documents. It doesn't
> have to be high memory overhead if you make sure the strings are interned.
>
> Perhaps Uwe or other Lucene devs have better ideas for approaching this;
> they often do :)
>
> --dho
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
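
For anyone finding this thread later, here is a minimal sketch of the TreeMap lookup Uwe describes, written in plain Java with no Lucene dependency. The class name SubIndexLookup, the labels, and the maxDoc values are made up for illustration. In a real application the counts would presumably come from calling maxDoc() on each direct subreader of the MultiReader, and the map would need to be rebuilt whenever the indexes are reopened.

```java
import java.util.TreeMap;

/** Maps a global document id back to the sub-index it came from. */
final class SubIndexLookup {
    // Key: first global doc id owned by a sub-index; value: a label for it.
    private final TreeMap<Integer, String> startToName = new TreeMap<Integer, String>();
    private final int totalDocs;

    /**
     * @param names   hypothetical labels, one per direct subreader, in reader order
     * @param maxDocs the maxDoc() value (not numDocs()) of each subreader, same order
     */
    SubIndexLookup(String[] names, int[] maxDocs) {
        if (names.length != maxDocs.length) {
            throw new IllegalArgumentException("names and maxDocs must have equal length");
        }
        int start = 0;
        for (int i = 0; i < names.length; i++) {
            // An empty sub-index owns no doc ids, so a later entry with the
            // same start simply replaces it.
            startToName.put(start, names[i]);
            start += maxDocs[i];
        }
        totalDocs = start;
    }

    /** Returns the label of the sub-index that contains the given global doc id. */
    String nameFor(int docId) {
        if (docId < 0 || docId >= totalDocs) {
            throw new IllegalArgumentException("docId out of range: " + docId);
        }
        // floorEntry finds the greatest start offset that is <= docId.
        return startToName.floorEntry(docId).getValue();
    }
}
```

With two collections of 100 and 50 documents, new SubIndexLookup(new String[] {"collectionA", "collectionB"}, new int[] {100, 50}).nameFor(120) returns "collectionB", because doc ids 100 through 149 fall in the second sub-index.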