Yes, I think you pinpointed what I see over and over with Solr. The two desires pull in opposite directions. I think Jason Rutherglen is very keen to start talking about Lucene clusters and index replication in such clusters without using the classic master/slave approach.
Jason, want to start a thread on java-dev?

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: mark harwood <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Thursday, August 28, 2008 6:21:19 AM
> Subject: Re: Replicating Lucene Index with out SOLR
>
> >> You don't need to copy the whole index every time
> >> if you do incremental indexing/updates and don't optimize the index
>
> But at 5 minute intervals for replication does this not quickly lead to a
> very fragmented index?
>
> It seems there is a fundamental conflict when building replication systems
> based entirely on the lucene file format:
> * In the interests of good search performance the index should ideally be a
>   small number of large files (which is what mergepolicy/optimize are all
>   about maintaining)
> * However, in the interest of minimising replication network traffic, the
>   ideal is a large number of small files.
>
> I've previously built replication systems which rely on each server pulling
> deltas in the form of insert/update/delete records from a database and using
> IndexWriter locally on each server to apply these sets of changes. Obviously
> this duplicates the analyzing/indexing effort across replicas but does mean
> the content being transferred is not restricted by the design of the Lucene
> file format and therefore uses minimal network traffic and places no
> restrictions on the IndexWriter merge policies I may choose to use to
> optimise search speed.
>
> Keen to explore the pros and cons of these different replication schemes.
>
> Cheers,
> Mark
>
> --- On Thu, 28/8/08, rahul_k123 wrote:
>
> > From: rahul_k123
> > Subject: Re: Replicating Lucene Index with out SOLR
> > To: java-user@lucene.apache.org
> > Date: Thursday, 28 August, 2008, 6:47 AM
> >
> > Can I make use of the Solr scripts for this purpose?
> >
> > The snapinstaller runs on the slave after a snapshot has been pulled from
> > the master. This signals the local Solr server to open a new index reader,
> > then auto-warming of the cache(s) begins (in the new reader), while other
> > requests continue to be served by the original index reader.
> >
> > How can I achieve the above in my case?
> >
> > Otis Gospodnetic wrote:
> > >
> > > You don't need to copy the whole index every time if you do incremental
> > > indexing/updates and don't optimize the index before copying. If you use
> > > rsync for copying the index, only the new/modified files will be copied.
> > > This is what Solr replication scripts do, too.
> > >
> > > Otis
> > > --
> > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > >
> > > ----- Original Message ----
> > >> From: rahul_k123
> > >> To: [EMAIL PROTECTED]
> > >> Sent: Wednesday, August 27, 2008 11:36:07 PM
> > >> Subject: Re: Replicating Lucene Index with out SOLR
> > >>
> > >> Currently we index every certain amount of time on A.
> > >>
> > >> -copy the index
> > >> Copying the whole index every time?
> > >>
> > >> Currently I am investigating how I can make use of the SOLR replication
> > >> scripts to achieve this.
> > >>
> > >> Is there anyone who did this without SOLR before?
> > >>
> > >> Thanks
> > >>
> > >> Otis Gospodnetic wrote:
> > >> >
> > >> > Hi,
> > >> >
> > >> > You may want to ask on the java-user list (more subscribers), which
> > >> > I'm CC-ing, so we can continue discussion there.
> > >> > I think you will have to implement your own logic that runs on A and
> > >> > does something like this:
> > >> >
> > >> > - stop adding new docs
> > >> > - call commit on the IndexWriter
> > >> > - copy the index
> > >> > - resume indexing
> > >> >
> > >> > Otis
> > >> > --
> > >> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > >> >
> > >> > ----- Original Message ----
> > >> >> From: rahul_k123
> > >> >> To: [EMAIL PROTECTED]
> > >> >> Sent: Thursday, August 28, 2008 1:34:41 AM
> > >> >> Subject: Replicating Lucene Index with out SOLR
> > >> >>
> > >> >> I have the following requirement.
> > >> >>
> > >> >> Right now we have multiple indexes serving our web application. Our
> > >> >> indexes are around 30 GB in size.
> > >> >>
> > >> >> We want to replicate the index data so that we can use the replicas
> > >> >> to distribute the search load.
> > >> >>
> > >> >> This is what we need ideally:
> > >> >>
> > >> >> A - (supports writes and reads)
> > >> >>
> > >> >> A1 - Replicated index (supports reads). We want to synchronize this
> > >> >> every 5 mins.
> > >> >>
> > >> >> Any help is appreciated. We are not using SOLR.
> > >> >>
> > >> >> I am also interested in knowing the best way to scale my application
> > >> >> by adding more boxes for search if our load increases.
> > >> >>
> > >> >> Thanks.
> > >> >>
> > >> >> --
> > >> >> View this message in context:
> > >> >> http://www.nabble.com/Replicating-Lucene-Index-with-out-SOLR-tp19191752p19191752.html
> > >> >> Sent from the Lucene - General mailing list archive at Nabble.com.
> > >>
> > >> --
> > >> View this message in context:
> > >> http://www.nabble.com/Replicating-Lucene-Index-with-out-SOLR-tp19191752p19193670.html
> > >> Sent from the Lucene - General mailing list archive at Nabble.com.
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > For additional commands, e-mail: [EMAIL PROTECTED]
> >
> > --
> > View this message in context:
> > http://www.nabble.com/Replicating-Lucene-Index-with-out-SOLR-tp19193696p19194576.html
> > Sent from the Lucene - Java Users mailing list archive at Nabble.com.
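
For anyone who wants to experiment with the "copy only new/modified files" idea from this thread without Solr or rsync, here is a rough sketch in plain Java. It is not how Solr's scripts or rsync actually work; it only illustrates the principle. The class name IncrementalIndexCopy and the method sync are made up for this example, and it assumes the index is a flat directory whose files are never modified in place once written, so that comparing name and size is enough to detect a change. It also assumes nothing is writing to the master directory while the copy runs (i.e. it is called between "call commit" and "resume indexing" in the steps quoted above).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper, not part of Lucene or Solr: one-way sync of a flat index directory. */
final class IncrementalIndexCopy {

    /** Lists the regular files directly inside a directory. */
    private static List<Path> regularFiles(Path dir) throws IOException {
        try (Stream<Path> files = Files.list(dir)) {
            return files.filter(Files::isRegularFile).collect(Collectors.toList());
        }
    }

    /**
     * Copies files that are missing from the replica or differ in size, then
     * deletes replica files that no longer exist on the master (for example,
     * segments removed by a merge). Returns the sorted names of copied files.
     */
    static List<String> sync(Path master, Path replica) throws IOException {
        Files.createDirectories(replica);
        Set<String> masterNames = new HashSet<>();
        List<String> copied = new ArrayList<>();

        for (Path src : regularFiles(master)) {
            String name = src.getFileName().toString();
            masterNames.add(name);
            Path dst = replica.resolve(name);
            // Assumption: files are write-once, so name + size identifies content.
            if (!Files.exists(dst) || Files.size(dst) != Files.size(src)) {
                Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
                copied.add(name);
            }
        }

        for (Path dst : regularFiles(replica)) {
            if (!masterNames.contains(dst.getFileName().toString())) {
                Files.delete(dst);
            }
        }

        Collections.sort(copied);
        return copied;
    }

    public static void main(String[] args) throws IOException {
        // Demo with throwaway directories and fake "segment" files.
        Path master = Files.createTempDirectory("master-index");
        Path replica = Files.createTempDirectory("replica-index");
        Files.write(master.resolve("_0.cfs"), new byte[] {1, 2, 3});
        System.out.println("first sync copied:  " + sync(master, replica));
        System.out.println("second sync copied: " + sync(master, replica));
    }
}
```

This also makes Mark's point visible: after an optimize or a large merge, most file names on the master are new, so nearly everything gets copied again. A real setup would additionally need to make the replica's reader see a consistent set of files, which this sketch does not attempt.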