Slightly off-topic. Robert -- you may want to look at SOLR-561, "Solr replication by Solr (for windows also)", which is under development.
https://issues.apache.org/jira/browse/SOLR-561

On Thu, Aug 28, 2008 at 7:39 PM, Robert Stewart <[EMAIL PROTECTED]> wrote:

> We don't use Solr, since we run on Windows <sigh>;(</sigh>, but we did
> implement very similar snapshot replication. We have 2 master index servers
> building indexes, partitioned by document. Every minute, we stop the index
> writer and create a local snapshot (on the master server) in a directory
> named YYYYMMDDHHMMSS for the current timestamp. Each query server has a
> background thread which periodically looks in the remote directories on the
> master server for a new snapshot directory. If it finds one, it copies the
> new snapshot locally to the query server, using the following algorithm:
>
> 1. Make a local copy of the existing local snapshot:
>    a. Copy all "changeable" files (segments file, etc.)
>    b. Create NTFS "hard-links" for all other files (index files)
> 2. Copy any new files in the new remote index which do not already exist in
>    the local snapshot (since Lucene never modifies existing index files, we
>    only need to copy new files and the new segments file).
> 3. Delete any files which no longer exist (this only deletes the local
>    hard-link, not the actual file in the current snapshot).
> 4. Open an index reader on the new local snapshot, and run some "warming"
>    queries.
> 5. Switch the current index reader object to the new index reader object so
>    searches go against the new local snapshot.
>
> Step 1 above is also used on the master index server when making new local
> snapshots.
>
> Also, note that we don't use rsync. You do not need it. You only need to
> make hard-links, and always copy any "changeable" files, such as the
> "segments" file. Lucene does not modify index files, it only creates new
> ones (and deletes old ones after a merge/optimization).
>
> We use the following settings for the index writer:
>
> MergeFactor = 2
> MaxBufferedDocs = 10
> MaxMergeDocs = 1,000,000
>
> This gives many segments but search is still very fast, and the total MB of
> new files copied for each snapshot is relatively small.
>
> Currently we have about 25 million documents in the master index.
>
> -----Original Message-----
> From: Bill Au [mailto:[EMAIL PROTECTED]
> Sent: Thursday, August 28, 2008 8:22 AM
> To: java-user@lucene.apache.org
> Subject: Re: Replicating Lucene Index with out SOLR
>
> The snapinstaller script invokes the commit command to trigger Solr to do a
> commit, which opens a new index reader and then auto-warms the caches. You
> will need to replace that with your own code to do the same for your Lucene
> index.
>
> On Thu, Aug 28, 2008 at 1:47 AM, rahul_k123 <[EMAIL PROTECTED]> wrote:
>
> > Can I make use of the Solr scripts for this purpose?
> >
> > The snapinstaller runs on the slave after a snapshot has been pulled from
> > the master. This signals the local Solr server to open a new index reader,
> > then auto-warming of the cache(s) begins (in the new reader), while other
> > requests continue to be served by the original index reader.
> >
> > How can I achieve the above in my case?
> >
> > Otis Gospodnetic wrote:
> > >
> > > You don't need to copy the whole index every time if you do incremental
> > > indexing/updates and don't optimize the index before copying. If you use
> > > rsync for copying the index, only the new/modified files will be copied.
> > > This is what the Solr replication scripts do, too.
> > >
> > > Otis
> > > --
> > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > >
> > > ----- Original Message ----
> > >> From: rahul_k123 <[EMAIL PROTECTED]>
> > >> To: [EMAIL PROTECTED]
> > >> Sent: Wednesday, August 27, 2008 11:36:07 PM
> > >> Subject: Re: Replicating Lucene Index with out SOLR
> > >>
> > >> Currently we index on A at fixed intervals.
> > >>
> > >> -copy the index
> > >> Copying the whole index every time?
> > >>
> > >> Currently I am investigating how I can make use of the SOLR replication
> > >> scripts to achieve this.
> > >>
> > >> Is there anyone who did this without SOLR before?
> > >>
> > >> Thanks
> > >>
> > >> Otis Gospodnetic wrote:
> > >> >
> > >> > Hi,
> > >> >
> > >> > You may want to ask on the java-user list (more subscribers), which I'm
> > >> > CC-ing, so we can continue the discussion there.
> > >> > I think you will have to implement your own logic that runs on A and
> > >> > does something like this:
> > >> >
> > >> > - stop adding new docs
> > >> > - call commit on the IndexWriter
> > >> > - copy the index
> > >> > - resume indexing
> > >> >
> > >> > Otis
> > >> > --
> > >> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > >> >
> > >> > ----- Original Message ----
> > >> >> From: rahul_k123
> > >> >> To: [EMAIL PROTECTED]
> > >> >> Sent: Thursday, August 28, 2008 1:34:41 AM
> > >> >> Subject: Replicating Lucene Index with out SOLR
> > >> >>
> > >> >> I have the following requirement.
> > >> >>
> > >> >> Right now we have multiple indexes serving our web application. Our
> > >> >> indexes are around 30 GB in size.
> > >> >>
> > >> >> We want to replicate the index data so that we can use the replicas to
> > >> >> distribute the search load.
> > >> >>
> > >> >> This is what we need ideally:
> > >> >>
> > >> >> A - (supports writes and reads)
> > >> >>
> > >> >> A1 - Replicated index (supports reads). We want to synchronize this
> > >> >> every 5 mins.
> > >> >>
> > >> >> Any help is appreciated. We are not using SOLR.
> > >> >>
> > >> >> I am also interested in knowing the best way to scale my application
> > >> >> by adding more boxes for search if our load increases.
> > >> >>
> > >> >> Thanks.
> > >> >>
> > >> >> --
> > >> >> View this message in context:
> > >> >> http://www.nabble.com/Replicating-Lucene-Index-with-out-SOLR-tp19191752p19191752.html
> > >> >> Sent from the Lucene - General mailing list archive at Nabble.com.
> > >>
> > >> --
> > >> View this message in context:
> > >> http://www.nabble.com/Replicating-Lucene-Index-with-out-SOLR-tp19191752p19193670.html
> > >> Sent from the Lucene - General mailing list archive at Nabble.com.
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > For additional commands, e-mail: [EMAIL PROTECTED]
> >
> > --
> > View this message in context:
> > http://www.nabble.com/Replicating-Lucene-Index-with-out-SOLR-tp19193696p19194576.html
> > Sent from the Lucene - Java Users mailing list archive at Nabble.com.

--
Regards,
Shalin Shekhar Mangar.
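
For readers of the archive, steps 1 to 3 of the hard-link snapshot scheme quoted in Robert's message can be sketched roughly as follows. This is only an illustration in Python, not the code anyone in the thread actually runs. The function names (`is_changeable`, `clone_snapshot`, `sync_from_master`) are hypothetical, and treating every file whose name starts with "segments" as "changeable" is an assumption based on the "segments file" mentioned in the thread; the real set of mutable files depends on the Lucene version. The sketch also assumes a flat index directory (files only) and a filesystem that supports hard links.

```python
import os
import shutil


def is_changeable(name):
    # Assumption: only the "segments*" files may be rewritten in place,
    # so they are always copied and never hard-linked.
    return name.startswith("segments")


def clone_snapshot(prev_dir, new_dir):
    """Step 1: build a new snapshot directory from the previous one."""
    os.makedirs(new_dir)
    for name in os.listdir(prev_dir):
        src = os.path.join(prev_dir, name)
        dst = os.path.join(new_dir, name)
        if is_changeable(name):
            shutil.copy2(src, dst)   # 1a: real copy of changeable files
        else:
            os.link(src, dst)        # 1b: hard link, no data is copied


def sync_from_master(master_dir, new_dir):
    """Steps 2 and 3: fetch new files, then drop files the master removed."""
    remote = set(os.listdir(master_dir))
    local = set(os.listdir(new_dir))

    for name in remote:
        if name not in local or is_changeable(name):
            dst = os.path.join(new_dir, name)
            # Remove first so we never write through a link into an
            # older snapshot that readers may still be using.
            if os.path.exists(dst):
                os.remove(dst)
            shutil.copy2(os.path.join(master_dir, name), dst)

    for name in local - remote:
        # Only this directory entry goes away; the previous snapshot
        # keeps its own link to the same data.
        os.remove(os.path.join(new_dir, name))
```

A query server would call `clone_snapshot(current_local, new_local)` followed by `sync_from_master(remote_snapshot, new_local)`, then open and warm a reader on `new_local` before switching searches over to it (steps 4 and 5, which are Lucene-specific and not shown here).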