Hi,

Taking a backup of the index with a naive file copy is not a good approach. As you mentioned, Lucene merges segments in the background, and if your application commits while the copy is running, old segment files may be deleted before they are copied. Also, your backup will most probably include files that have not been committed yet.

Rather, you should use SnapshotDeletionPolicy to take a snapshot of the index, then copy all the files referenced by the snapshot. You can also try the new Replicator module (will be available in Lucene 4.4) to take periodic backups of the index with very few steps required on your end.

You can read about it here: http://shaierera.blogspot.com/2013/05/the-replicator.html

Shai

On Thu, Jun 6, 2013 at 11:14 AM, Daniel Penning <dpenn...@gamona.de> wrote:

> I do my backups by creating a new index at the backup target and copying everything over with IndexWriter#addIndexes(IndexReader... readers). In the future I am also planning on using a RateLimitedDirectoryWrapper to reduce the influence of the running backup on the rest of the system.
>
> On 06.06.2013 09:43, Thomas Matthijs wrote:
>
>> On Thu, Jun 6, 2013 at 7:38 AM, Lance Norskog <goks...@gmail.com> wrote:
>>
>>> The simple answer (that somehow nobody gave) is that you can make a copy of an index directory at any time. Indexes are changed in "generations". The segment* files describe the current generation of files. All active indexing goes on in new files. In a commit, all new files are flushed to disk and then the segment* files change. At any point in this sequence, all of the files in the directory form one consistent index.
>>>
>>> This isn't like MySQL or other databases where you have to shut down the DB to get a safe copy of the files.
>>
>> If you just do a naive copy, where it gets a file list first and then copies them, segments can be merged during the copy and deleted by Lucene, resulting in an incomplete backup. That is why you need the snapshot policy to keep them around until the copy is completed.
>>
>> If you have very few updates and don't mind risking a broken index, or just loop rsync till both sides are equal, you don't need anything else indeed.
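
For reference, here is a minimal sketch of the "copy all the files referenced by the snapshot" step recommended in the reply. It uses only the JDK and assumes you already hold the list of file names belonging to the snapshot. In Lucene that list would come from IndexCommit.getFileNames() on the commit returned by SnapshotDeletionPolicy, and the snapshot should be released once the copy finishes. Those Lucene calls are left out because their exact signatures differ between versions. The class name SnapshotBackup and the method copySnapshotFiles are hypothetical names chosen for this illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Collection;

// Hypothetical helper, not part of Lucene.
final class SnapshotBackup {

    /**
     * Copies exactly the named files from indexDir into backupDir and
     * returns how many files were copied.
     *
     * Assumption: fileNames is the file list of a snapshot that is still
     * being held, so none of these files is deleted during the copy.
     * Files in indexDir that are not in the list (for example files that
     * are not committed yet) are deliberately ignored.
     */
    static int copySnapshotFiles(Path indexDir,
                                 Collection<String> fileNames,
                                 Path backupDir) throws IOException {
        Files.createDirectories(backupDir);
        int copied = 0;
        for (String name : fileNames) {
            // Throws an IOException if a listed file is missing, which would
            // mean the snapshot was not held for the duration of the copy.
            Files.copy(indexDir.resolve(name),
                       backupDir.resolve(name),
                       StandardCopyOption.REPLACE_EXISTING);
            copied++;
        }
        return copied;
    }
}
```

The point of copying from a fixed list rather than listing the directory is that the backup then contains one committed generation of the index and nothing else.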