Re: Keeping a rolling window of indexes around solr

Chris Hostetter Tue, 28 May 2013 16:27:32 -0700

: This is kind of the approach used by elastic search , if I'm not using 
: solrcloud will I be able to use shard aliasing, also with this approach 
: how would replication work, is it even needed?


you haven't said much about hte volume of data you expect to deal with, 
nor have you really explained what types of queries you intend to do -- 
ie: you said you were intersted in a "rolling window of indexes
around n days of data" but you never clarified why you think a 
rolling window of indexes would be useful to you or how exactly you would 
use it.

The primary advantage of sharding by date is if you know that a large 
percentage of your queries are only going to be within a small range of 
time, and therefore you can optimize those requests to only hit the shards 
neccessary to satisfy that small windo of time.

if the majority of requests are going to be across your entire "n days" of 
data, then date based sharding doesn't really help you -- you can just use 
arbitrary (randomized) sharding using periodic deleteByQuery commands to 
purge anything older then N days.  Query the whole collection by default, 
and add a filter query if/when you want to restrict your search to only a 
narrow date range of documents.

this is the same general approach you would use on a non-distributed / 
non-SolrCloud setup if you just had a single collection on a single master 
replicated to some number of slaves for horizontal scaling.


-Hoss

Re: Keeping a rolling window of indexes around solr

Reply via email to