Theoretically, a commit alone should have negligible effect on the slave, 
because of the same aspect of Solr architecture that makes too-frequent commits 
problematic: an existing Searcher continues to serve requests off the old 
version of the index until the new commit (plus all its warming) is complete, 
at which point the newly warmed Searcher switches into action. 

That holds so long as there's enough RAM available for both Searchers, and so 
long as there's enough CPU available that the committing and warming of the new 
one doesn't starve things out. (This is where the 'too frequent commit' problem 
comes in: you get so many overlapping commits that you run out of RAM and/or 
CPU.)
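
To make the order of events concrete, here is a toy sketch of the swap in 
Python. It is NOT Solr's code; the ToyIndex class and its methods are names I 
made up for illustration. It only shows that searches keep hitting the old 
snapshot until the new one has finished warming, and that both snapshots are 
held in memory during that window.

```python
class ToyIndex:
    """Toy model of the searcher swap; hypothetical, not Solr's implementation."""

    def __init__(self, docs):
        self._searcher = tuple(docs)  # snapshot currently serving queries
        self.warming = 0              # new snapshots that are still warming

    def search(self, term):
        # Queries always go to whichever snapshot is currently registered.
        return [doc for doc in self._searcher if term in doc]

    def commit(self, new_docs, warm=None):
        candidate = tuple(new_docs)   # old and new snapshots both live in RAM here
        self.warming += 1
        try:
            if warm is not None:
                warm(candidate)       # stand-in for warming queries
        finally:
            self.warming -= 1
        self._searcher = candidate    # swap only after warming has completed


index = ToyIndex(["old doc"])
seen_during_warming = []
index.commit(
    ["old doc", "new doc"],
    warm=lambda snapshot: seen_during_warming.append(index.search("doc")),
)
```

A search issued from inside the warm callback still sees only the old 
document; the new one shows up once commit() returns.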

However, this same 'theoretical' logic could be used to argue that you should 
be able to commit directly to the 'slave' without any replication at all with 
no performance implications, which doesn't seem to match actually observed 
results. So maybe it should be taken with a grain of salt, and investigated 
empirically. For that matter, it has seemed to me that even in the master-slave 
setup that I use, there is SOME performance implication while the commit is 
going on, although I haven't benchmarked it well; that's just my impression. 
But it hasn't been a disastrous one, and it lasts a relatively short timespan 
in the replication scenario.  
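
If anyone wants to investigate empirically, something like the following 
Python could be a starting point. It is only a rough sketch: the function 
names (time_queries, summarize, make_http_query) are my own, and the URL you 
pass in is whatever your slave's select handler happens to be. The idea would 
be to run it once while the slave is idle and once while a replication pull 
and commit are in progress, then compare the numbers.

```python
import statistics
import time
import urllib.request


def time_queries(run_query, n=100):
    """Call run_query() n times; return per-call latencies in milliseconds."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        run_query()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies):
    """Return median, approximate 95th percentile, and max of the latencies."""
    ordered = sorted(latencies)
    p95_index = max(0, int(round(0.95 * len(ordered))) - 1)
    return {
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[p95_index],
        "max_ms": ordered[-1],
    }


def make_http_query(url):
    """Build a zero-argument callable that fetches url and reads the response."""
    def run():
        with urllib.request.urlopen(url) as response:
            response.read()
    return run
```

For example, summarize(time_queries(make_http_query(slave_url))) with 
slave_url set to a representative query against your own slave. A single 
fixed query will mostly be answered from cache, so a mix of realistic queries 
would give a more honest picture.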

Running master and slave on the very same server (one with a whole bunch of 
cores and plenty of RAM), there haven't seemed to me to be any performance 
implications on searching the slave while 'add'ing to the master (in a 
completely separate Java container). I only see them when actually doing the 
replication pull (and its inherent commit to the slave). 
________________________________________
From: kenf_nc [ken.fos...@realestate.com]
Sent: Wednesday, May 11, 2011 9:46 AM
To: solr-user@lucene.apache.org
Subject: Re: how to do offline adding/updating index

My understanding is that the Master has done all the indexing, that
replication is a series of file copies to a temp directory, then a move and
commit. The slave only gets hit with the effects of a commit, so whatever
warming queries are in place get run, and the caches get reset. Doing too many
commits too often is a problem in any situation with Solr and I wouldn't
recommend it here. However, the original question implied commits would
occur approximately once an hour, which is easily within the capabilities of
the system. Fine tuning of warming queries should minimize any performance
impact. Any effects should also be a relatively linear constant; they should
not be wildly affected by the size of the update or the number of documents.
Warming query results may be slightly different with new documents, but on
the other hand, your new documents are now in cache ready for fast search,
so a reasonable trade off.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-do-offline-adding-updating-index-tp2923035p2927336.html
Sent from the Solr - User mailing list archive at Nabble.com.
