Depending on your architecture, why not index the same data into two machines?
One would be your prod, the other your backup.
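
If you go that route, Solr's built-in ReplicationHandler can keep the second
machine in sync automatically. A rough sketch for solrconfig.xml (host name and
core name are just examples; check the wiki for your exact version):

```xml
<!-- On the prod (master) node -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <!-- push index changes to slaves after every commit -->
    <str name="replicateAfter">commit</str>
  </lst>
</requestHandler>

<!-- On the backup (slave) node -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://prod-host:8983/solr/collection1</str>
    <!-- poll the master once a minute -->
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```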

Thanks.
Alex.

-----Original Message-----
From: Upayavira <u...@odoko.co.uk>
To: solr-user <solr-user@lucene.apache.org>
Sent: Thu, Dec 20, 2012 11:51 am
Subject: Re: Pause and resume indexing on SolR 4 for backups


You're saying that there's no chance to catch it in the middle of
writing the segments file?

Having said that, the segments file is pretty small, so the chance would
be pretty slim.

Upayavira

On Thu, Dec 20, 2012, at 06:45 PM, Lance Norskog wrote:
> To be clear: 1) is fine. Lucene index updates are carefully sequenced so 
> that the index is never in a bogus state. All data files are written and 
> flushed to disk, then the segments.* files are written that match the 
> data files. You can capture the files with a set of hard links to create 
> a backup.
> 
> The CheckIndex program will verify the index backup.
> java -cp yourcopy/lucene-core-SOMETHING.jar 
> org.apache.lucene.index.CheckIndex collection/data/index
> 
> lucene-core-SOMETHING.jar is usually in the solr-webapp directory where 
> Solr is unpacked.
> 
> On 12/20/2012 02:16 AM, Andy D'Arcy Jewell wrote:
> > Hi all.
> >
> > Can anyone advise me of a way to pause and resume SolR 4 so I can 
> > perform a backup? I need to be able to revert to a usable (though not 
> > necessarily complete) index after a crash or other "disaster" more 
> > quickly than a re-index operation would yield.
> >
> > I can't yet afford the "extravagance" of a separate SolR replica just 
> > for backups, and I'm not sure if I'll ever have the luxury. I'm 
> > currently running with just one node, but we are not yet live.
> >
> > I can think of the following ways to do this, each with various 
> > downsides:
> >
> > 1) Just backup the existing index files whilst indexing continues
> >     + Easy
> >     + Fast
> >     - Incomplete
> >     - Potential for corruption? (e.g. partial files)
> >
> > 2) Stop/Start Tomcat
> >     + Easy
> >     - Very slow, and I/O- and CPU-intensive
> >     - Client gets errors when trying to connect
> >
> > 3) Block/unblock SolR port with iptables
> >     + Fast
> >     - Client gets errors when trying to connect
> >     - Have to wait for existing transactions to complete (not sure 
> > how; maybe watch socket FDs in /proc)
> >
> > 4) Pause/Restart SolR service
> >     + Fast ? (hopefully)
> >     - Client gets errors when trying to connect
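
On the "wait for existing transactions" part of option 3: one way to watch for
open connections is to parse /proc/net/tcp, as Andy suggests. A rough sketch
(Linux only, IPv4 only; port 8983 is assumed):

```shell
# Count ESTABLISHED TCP connections to a local port via /proc/net/tcp.
PORT=8983
HEXPORT=$(printf '%04X' "$PORT")
# Field 2 is local addr:port in hex; field 4 is the state ("01" = ESTABLISHED).
COUNT=$(awk -v p="$HEXPORT" \
    'NR > 1 && $4 == "01" { split($2, a, ":"); if (a[2] == p) n++ }
     END { print n + 0 }' /proc/net/tcp)
echo "$COUNT open connections on port $PORT"
```

You would loop on this until it hits zero before flipping the iptables rule.
Note it only sees IPv4; check /proc/net/tcp6 as well if clients connect over
IPv6.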
> >
> > In any event, the web app will have to gracefully handle 
> > unavailability of SolR, probably by displaying a "down for 
> > maintenance" message, but this should preferably be only a very short 
> > amount of time.
> >
> > Can anyone comment on my proposed solutions above, or provide any 
> > additional ones?
> >
> > Thanks for any input you can provide!
> >
> > -Andy
> >
> 
