Sridhar,

We have been using approach 2 in our production system with good results. We
have separate processes for indexing and searching. The main issue that came
up was in deleting old indexes (see: http://tinyurl.com/32q8c4). Most of
our production problems occur during indexing, and we are able to fix these
without having to interrupt searching at all. This has been a real benefit.
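
For what it is worth, the copy-then-switch step in approach 2 can be sketched
roughly as below. This is only a minimal illustration, not our production code
and not a Lucene API: IndexSwitcher, snapshot() and switchTo() are hypothetical
names, and it assumes the index directory is flat (regular files only) and that
the indexing process has committed and is not writing while the copy runs.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicReference;
import java.util.stream.Stream;

// Hypothetical helper (not part of Lucene): tracks which index copy searches should use.
final class IndexSwitcher {
    private final AtomicReference<Path> live = new AtomicReference<>();

    IndexSwitcher(Path initialSearchDir) {
        live.set(initialSearchDir);
    }

    // Directory that search threads should currently read from.
    Path current() {
        return live.get();
    }

    // Copy every regular file of the index directory into a fresh snapshot directory.
    // Assumption: the indexer is quiescent (committed, not writing) during the copy.
    static Path snapshot(Path indexDir, Path snapshotDir) throws IOException {
        Files.createDirectories(snapshotDir);
        try (Stream<Path> files = Files.list(indexDir)) {
            files.filter(Files::isRegularFile).forEach(f -> {
                try {
                    Files.copy(f, snapshotDir.resolve(f.getFileName()));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
        return snapshotDir;
    }

    // At time T: publish the snapshot atomically and return the previous
    // directory so the caller can clean it up later.
    Path switchTo(Path snapshotDir) {
        return live.getAndSet(snapshotDir);
    }
}
```

The search side would open a new searcher on current() after each switch. The
directory returned by switchTo() should only be deleted once every searcher
still reading it has been closed, which is where the deletion issue mentioned
above tends to show up.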

Peter


On Thu, Mar 6, 2008 at 5:30 AM, Sridhar Raman <[EMAIL PROTECTED]>
wrote:

> This is my situation.  I have an index, which has a lot of search requests
> coming into it.  I use just a single instance of IndexSearcher to process
> these requests.  At the same time, this index is also getting updated by
> an IndexWriter.  And I want these new changes to be reflected _only_ at
> certain intervals.  I have thought of a few ways of doing this.  Each has
> its share of problems and pluses.  I would be glad if someone can help me
> in figuring out the right approach, especially from the performance point
> of view, as the number of documents that will get indexed is pretty large.
>
> Approach 1:
> Have just one copy of the index for both Search & Index.  At time T, when
> I need to see the new changes reflected, I close the Searcher, and open it
> again.
> - The re-open of the Searcher might be a bit slow (which I could probably
> solve by using some warm-up threads).
> - Update and Search on the index at the same time - will this affect the
> performance?
> - If the server crashes before time T, the new Searcher would reflect the
> changes, which is not acceptable.  I want the changes to be reflected only
> at time T.  If the server crashes, the index should be the previous T-1
> index.
> - Possible problems while optimising the index (as Search is also
> happening).
> + Just one copy of the index being stored.
>
> Approach 2:
> Keep 2 copies of the index - 1 for Search, 1 for Index.  At time T, I just
> switch the Searcher to a copy of the index that is being updated.
> - Before I do the switch to the new index, I need to make a copy of it so
> that the updates continue to happen on the other index.  Is there a
> convenient way to make this copy?  Is it efficient?
> - Time taken to create a new Searcher will still be a problem (but this is
> a problem in the previous approach as well, and we can live with it).
> + Optimise can happen on an index that is not being read; as a result, its
> resource requirements would be lower, and optimisation would probably be
> faster too.
> + Faster search as the index update is happening on a different index.
>
> So, these are the 2 approaches I am contemplating.  Any pointers on which
> would be the better approach?
>
> Thanks,
> Sridhar
>
