I think the key question here is how best to index without affecting search performance, or at least without affecting it much. If you have a batch of documents to index (say a daily batch that takes an hour to index and merge), you'd like to do that on an offline system and then, when ready, bring that index up for searching. But using Lucene's multiple commit points assumes you use the same box for search and indexing, doesn't it?

Something like this is what I have in mind (simple 2-server config here):

Box 1 is live and searching
Box 2 is offline and ready to index

loading begins on Box 2...
loading complete on Box 2 ...
commit, optimize

Swap Box 1 and Box 2 (with a load balancer or application config?)
Box 2 is live and searching
Box 1 is offline and ready to index
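
The cycle above can be sketched as a small role-tracking model. This is only an illustration, not Solr or Lucene code: the names `BoxPair`, `run_indexing_cycle` and `index_batch` are hypothetical, and `swap()` stands in for whatever load balancer or application config change actually redirects traffic.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BoxPair:
    """Tracks which box serves searches and which one is free for indexing."""
    live: str
    offline: str

    def swap(self) -> None:
        # Stand-in for the load balancer or application config change.
        self.live, self.offline = self.offline, self.live


def run_indexing_cycle(pair: BoxPair, index_batch: Callable[[str], None]) -> None:
    # Load, commit and optimize on the offline box only;
    # the live box keeps answering searches in the meantime.
    index_batch(pair.offline)
    # Promote the freshly indexed box once the batch is complete.
    pair.swap()
```

Calling `run_indexing_cycle(BoxPair("box1", "box2"), my_indexer)` would index on box2 and then leave box2 live and box1 offline, matching the steps listed above.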

To make the best use of your resources, you'd then like to start using Box 1 for searching (until indexing starts up again). Perhaps if your load balancing is clever enough, it could be sensitive to the decreased performance of the indexing box and just send more requests to the other one(s). That's probably ideal.
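
One way to picture a load balancer that is "sensitive to the decreased performance of the indexing box" is to weight each box by the inverse of its recently observed latency. The sketch below is an assumption about how such a policy might look, not a feature of any particular load balancer; `routing_weights` and the latency figures are hypothetical.

```python
from typing import Dict


def routing_weights(latencies_ms: Dict[str, float]) -> Dict[str, float]:
    """Return the share of traffic per box, inversely proportional to latency."""
    if any(ms <= 0 for ms in latencies_ms.values()):
        raise ValueError("latencies must be positive")
    # A slower box (for example, one that is busy indexing) gets a smaller inverse.
    inverse = {box: 1.0 / ms for box, ms in latencies_ms.items()}
    total = sum(inverse.values())
    # Normalize so the shares add up to 1.
    return {box: value / total for box, value in inverse.items()}
```

With this policy a box that answers three times slower while indexing would receive a third as much traffic as the other one, and the shares even out again once indexing finishes.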

-Mike S

Under the hood, Lucene can support this by keeping multiple commit
points in the index.

So you'd make a new commit whenever you finish indexing the updates
from each hour, and record that this is the last "searchable" commit.

Then you are free to commit while indexing the next hour's worth of
changes, but these commits are not marked as searchable.
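
As a rough illustration of the idea (a toy in-memory model, not the Lucene API), the scheme amounts to keeping every commit and separately recording which one searches should use. The class `CommitLog` and its methods are hypothetical names; in Lucene itself, which older commits are retained is governed by the index deletion policy.

```python
from typing import FrozenSet, Iterable, List, Optional


class CommitLog:
    """Toy model of an index that keeps several commit points."""

    def __init__(self) -> None:
        self._commits: List[FrozenSet[str]] = []
        self._searchable: Optional[int] = None  # generation searches should use

    def commit(self, doc_ids: Iterable[str], searchable: bool = False) -> int:
        # Each commit stores a snapshot of the documents visible at that point.
        self._commits.append(frozenset(doc_ids))
        generation = len(self._commits)
        if searchable:
            # Record this as the last "searchable" commit.
            self._searchable = generation
        return generation

    def search_view(self) -> FrozenSet[str]:
        # Searches only ever see the last commit marked as searchable.
        if self._searchable is None:
            return frozenset()
        return self._commits[self._searchable - 1]
```

Intermediate commits made during the next hour's indexing leave `search_view()` unchanged until a commit is made with `searchable=True`.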

But... this is a low level Lucene capability and I don't know of any
plans for Solr to support multiple commit points in the index.

Mike

http://blog.mikemccandless.com

On Tue, May 10, 2011 at 9:22 AM, vrpar...@gmail.com<vrpar...@gmail.com>  wrote:
Hello all,

Indexing with DataImportHandler runs every hour (new records will be added,
some records will be updated). Note: the data set is large.

The requirement is that while indexing is in progress, searching (on already
indexed data) should not be affected.

So should I use multicore with merge and swap, or a delta query, or some other
way?

Thanks

--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-do-offline-adding-updating-index-tp2923035p2923035.html
Sent from the Solr - User mailing list archive at Nabble.com.
