Also, it's generally not a good idea to check for an IndexReader reopen on every search request: that could happen far too often if you suddenly hit high search load, and if it's a big reopen (a large segment merge just completed) you slow down that one unlucky search too much; it's better to have a background thread do the reopen periodically.
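[Editor's note: a minimal sketch of that background-reopen pattern using Lucene's SearcherManager, which the next paragraph mentions. It is not from the original mail; it assumes you already have an IndexWriter, and the class and field names are placeholders.]

    import java.io.IOException;

    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.SearcherManager;
    import org.apache.lucene.search.TopDocs;

    public class PeriodicReopen {
      private final SearcherManager manager;
      private volatile boolean running = true;

      public PeriodicReopen(IndexWriter writer) throws IOException {
        // applyAllDeletes=true, default SearcherFactory
        manager = new SearcherManager(writer, true, null);
        Thread refresher = new Thread(new Runnable() {
          public void run() {
            while (running) {
              try {
                manager.maybeRefresh();  // cheap no-op if nothing changed
                Thread.sleep(1000);      // reopen roughly once per second
              } catch (Exception e) {
                break;
              }
            }
          }
        });
        refresher.setDaemon(true);
        refresher.start();
      }

      public TopDocs search(Query query, int n) throws IOException {
        IndexSearcher searcher = manager.acquire();  // pin the current searcher
        try {
          return searcher.search(query, n);
        } finally {
          manager.release(searcher);  // release, never close, what you acquired
        }
      }

      public void stop() {
        running = false;
      }
    }

SearcherManager also accepts a SearcherFactory if new searchers need to be warmed before they are exposed to queries.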
The SearcherManager class simplifies this for you ...

Mike McCandless

http://blog.mikemccandless.com

On Fri, Jul 12, 2013 at 3:55 PM, Sriram Sankar <san...@gmail.com> wrote:
> Thanks!
>
> On Tue, Jul 9, 2013 at 2:13 PM, Adrien Grand <jpou...@gmail.com> wrote:
>> Hi Sriram,
>>
>> On Tue, Jul 9, 2013 at 5:06 AM, Sriram Sankar <san...@gmail.com> wrote:
>> > I've finally got something running and will send you some performance
>> > numbers as promised shortly. In the meantime, I have a question regarding
>> > the use of real-time indexing along with ordering by static rank. Before
>> > each search, I do the reopen as follows:
>> >
>> >   public void refresh() throws IOException {
>> >     DirectoryReader r = DirectoryReader.openIfChanged(reader);
>> >     if (r != null) {
>> >       reader.close();
>> >       reader = r;
>> >       this.live = SortingAtomicReader.wrap(
>> >           new SlowCompositeReaderWrapper(reader),
>> >           new StaticRankSorter());
>> >     }
>> >   }
>> >
>> > This works fine. However, I believe the index is re-sorted every time I
>> > reopen the index. Ideally, it would be nice to do the sort more
>> > incrementally each time a new document gets added. I assume that this is
>> > not easy - but just in case you have ideas, I'd like to hear them.
>>
>> I think a good trade-off could be to fully collect the small segments
>> that come from incremental updates. Since they are small, collecting
>> them will be fast anyway. By contrast, the bottleneck is likely
>> the collection of large segments. This is why we chose to tackle the
>> problem of online sorting with a merge policy (SortingMergePolicy):
>> segments are only sorted when merging, meaning that small NRT
>> (flushed) segments won't be sorted but large (merged) segments will
>> be.
>>
>> Then computing the top hits is just a matter of computing the best
>> hits on every segment and merging them into a single hit list:
>> - for flushed segments, you need to fully collect them like Lucene
>>   does by default,
>> - for sorted segments, you can early-terminate collection on a
>>   per-segment basis once enough matches have been collected.
>>
>> --
>> Adrien
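[Editor's note: for the early-termination part of Adrien's answer above, a hand-rolled sketch against the Lucene 4.x Collector API could look like the class below. The name EarlyTerminatingCollector and the delegate wiring are illustrative, not Lucene's own; a real setup would fully collect the small flushed segments and only terminate on segments known to be sorted, e.g. those produced by SortingMergePolicy.]

    import java.io.IOException;

    import org.apache.lucene.index.AtomicReaderContext;
    import org.apache.lucene.search.CollectionTerminatedException;
    import org.apache.lucene.search.Collector;
    import org.apache.lucene.search.Scorer;

    public class EarlyTerminatingCollector extends Collector {
      private final Collector delegate;  // e.g. a TopFieldCollector
      private final int numHits;         // hits to collect per segment
      private int collected;

      public EarlyTerminatingCollector(Collector delegate, int numHits) {
        this.delegate = delegate;
        this.numHits = numHits;
      }

      @Override
      public void setScorer(Scorer scorer) throws IOException {
        delegate.setScorer(scorer);
      }

      @Override
      public void collect(int doc) throws IOException {
        delegate.collect(doc);
        if (++collected >= numHits) {
          // IndexSearcher catches this and moves on to the next segment,
          // so collection stops early on this (index-sorted) segment only.
          throw new CollectionTerminatedException();
        }
      }

      @Override
      public void setNextReader(AtomicReaderContext context) throws IOException {
        collected = 0;  // reset the per-segment counter
        delegate.setNextReader(context);
      }

      @Override
      public boolean acceptsDocsOutOfOrder() {
        return false;  // docs must come in doc-id (i.e. sort) order
      }
    }

If your Lucene version ships an EarlyTerminatingSortingCollector alongside SortingMergePolicy, it is worth checking that class before rolling your own, since it already knows which segments were sorted by the merge policy.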