As noted here:
https://groups.google.com/forum/#!searchin/elasticsearch/snapshot$20duration/elasticsearch/bCKenCVFf2o/TFK-Es0wxSwJ

the time it takes to perform a snapshot increases with the number of 
snapshots you have already taken.  Eventually this becomes untenable.  So 
far, the only solutions seem to be to trim old snapshots or to start 
snapshotting into a new repository, which resets the performance.


1) When I perform snapshots, I want to snapshot all indices.  However, all 
of my indices are timestamped logstash-style, and the only index that 
receives new documents is today's.  I would think Elasticsearch could 
optimize for this and not look through all the snapshots when an index is 
older than today.  If there were some mechanism to mark an index as frozen 
(read-only), then snapshotting could be very fast: query a 'frozenTime' 
for all indices and only try to snapshot the unfrozen ones.


2) I can sort of solve the above problem by snapshotting only today's 
index, but then restore is cumbersome.  Say I retain 60 days' worth of 
data; that means I have to retain 60 days' worth of snapshots, and to do a 
full restore I have to restore all 60 of them.  The situation gets worse 
if I want to snapshot multiple times a day.
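
To make the trade-off in (2) concrete, here is a rough Python sketch that 
only builds the REST requests involved; it does not send anything.  The 
repository name, the "snap-YYYY.MM.dd" snapshot naming scheme, and the 
function names are assumptions made up for illustration.  The endpoints 
are the standard snapshot and restore ones.

```python
import json
from datetime import date, timedelta


def todays_snapshot_request(repository, day, prefix="logstash-"):
    """Build the request that snapshots only the index for `day`.

    The snapshot name "snap-YYYY.MM.dd" is a hypothetical convention.
    """
    index = f"{prefix}{day:%Y.%m.%d}"
    snapshot = f"snap-{day:%Y.%m.%d}"
    path = f"/_snapshot/{repository}/{snapshot}?wait_for_completion=true"
    body = {
        "indices": index,               # restrict the snapshot to one index
        "ignore_unavailable": True,
        "include_global_state": False,
    }
    return "PUT", path, json.dumps(body, sort_keys=True)


def restore_requests(repository, last_day, days):
    """List one restore request per retained day, oldest first."""
    requests = []
    for offset in range(days - 1, -1, -1):
        day = last_day - timedelta(days=offset)
        requests.append(
            ("POST", f"/_snapshot/{repository}/snap-{day:%Y.%m.%d}/_restore")
        )
    return requests
```

With 60 days of retention, restore_requests("backups", date.today(), 60) 
returns 60 separate restore calls, which is exactly the cumbersome part.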


3) While it would not really fix the crux of the problem, the curator 
script could help here.  Right now, you can only trim snapshots 
--older-than some date.  But what if there were a --thin option?  Say I 
take snapshots every hour; I'm only really interested in that precision 
for the past 24 hours.  A backup from a month ago at 12pm is not much 
different to me than one from a month ago at 1pm.  The proposed option 
(it does not exist today) would look something like this:

curator snapshot --thin-older-than 1 --retain-copies 1

For snapshots more than one day old, this would delete all but the last 
snapshot taken on each day.
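
The selection logic behind such a thinning option could look roughly like 
the Python sketch below.  It is not curator code: the function name, its 
parameters, and the input format (a dict of snapshot name to creation 
time) are all assumptions chosen to mirror the proposed flags.

```python
from datetime import datetime, timedelta


def snapshots_to_thin(snapshots, now, thin_older_than_days=1, retain_copies=1):
    """Return the names of snapshots a thinning pass would delete.

    snapshots: dict mapping snapshot name -> datetime it was taken.
    Snapshots newer than the cutoff are never touched.  Older ones are
    grouped by calendar day, and only the latest `retain_copies` in
    each group are kept.
    """
    cutoff = now - timedelta(days=thin_older_than_days)

    # Group the old snapshots by the day they were taken.
    by_day = {}
    for name, taken_at in snapshots.items():
        if taken_at < cutoff:
            by_day.setdefault(taken_at.date(), []).append((taken_at, name))

    doomed = []
    for day_snaps in by_day.values():
        day_snaps.sort()  # oldest first within the day
        n_delete = max(0, len(day_snaps) - retain_copies)
        doomed.extend(name for _, name in day_snaps[:n_delete])
    return sorted(doomed)
```

The returned names would then be deleted one at a time with the normal 
snapshot delete call, so the thinning itself needs nothing new from 
Elasticsearch.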



I'd love to hear thoughts on this and how people are currently solving this 
problem in an automated way.
