Re: Programmatic restructuring of a Solr cloud

Jan Høydahl Thu, 05 May 2011 06:07:36 -0700

Hi,

One approach, if you're on Amazon, is to use Elastic Beanstalk:


* Create one master with 12 cores, named "jan", "feb", "mar", etc.
* Every month, clear the core for the new month and switch indexing to it.
  You only need one master, because you index into only one month at a time.
* For each of the 12 months, set up an Amazon Beanstalk instance with a Solr
  replica pointing to that month's core on the master.
  This way, Amazon will spin up replicas as needed.
  NOTE: Your replica could still be located at /solr/select even if it
  replicates from /solr/may/replication
* You only query the replicas, and the client controls whether to query one
  or more shards:
  &shards=jan.elasticbeanstalk.com/solr,feb.elasticbeanstalk.com/solr,mar.elasticbeanstalk.com/solr
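
On the client side, the shards value can be derived from the date. Here is a rough sketch in Python, assuming the host naming from the example above (one <month>.elasticbeanstalk.com/solr endpoint per month). The helper name shards_param and the months_back argument are made up for illustration:

```python
from datetime import date

# Core/host prefixes, matching the "jan", "feb", "mar" naming above.
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]

def shards_param(today, months_back=1):
    """Build a shards value covering the current month and earlier ones.

    months_back=1 means the current month only. The host pattern is an
    assumption taken from the example in this mail.
    """
    hosts = []
    for offset in range(months_back):
        # Modulo 12 wraps January back to December of the previous year.
        index = (today.month - 1 - offset) % 12
        hosts.append(f"{MONTHS[index]}.elasticbeanstalk.com/solr")
    return ",".join(hosts)

if __name__ == "__main__":
    print("&shards=" + shards_param(date.today(), months_back=3))
```

Since the users in this case always search a single month, months_back=1 would be the common call; the multi-month form is only there for the occasional wider query.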

After this is set up, you have zero config to worry about :)
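
The one recurring job left is the monthly switch on the master: clear the core for the new month, then index into it. A minimal sketch of how that could be scripted, assuming the core names above and a made-up master address (MASTER_URL is a placeholder, and core_for and clear_request are hypothetical helper names). It only builds the request and does not send it:

```python
from datetime import date

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]

# Placeholder address, not from the original setup; replace with your master.
MASTER_URL = "http://master.example.com:8983/solr"

def core_for(day):
    """Name of the core that holds the month containing `day`."""
    return MONTHS[day.month - 1]

def clear_request(day, master_url=MASTER_URL):
    """Return (url, body) for a delete-all update on that month's core.

    The body is Solr's XML delete-by-query message; commit=true makes the
    deletion visible right away.
    """
    url = f"{master_url}/{core_for(day)}/update?commit=true"
    body = "<delete><query>*:*</query></delete>"
    return url, body

if __name__ == "__main__":
    url, body = clear_request(date.today())
    print("POST", url)
    print(body)
```

Run it from cron on the first of each month and POST the body with Content-Type: text/xml. The indexer can use the same core_for() to decide where new documents go.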

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 5. mai 2011, at 14.03, Sergey Sazonov wrote:

> Dear Solr Experts,
> 
> First of all, I would like to thank you for your patience when answering 
> questions of those who are less experienced.
> 
> And now to the main topic: I would like to learn whether it is possible to 
> restructure a Solr cloud programmatically.
> 
> Let me describe the system we are designing to make the requirements clear. 
> The indexed documents are certain log entries. We are planning to shard them 
> by month, and only keep the last 12 months in the index. We are going to 
> replicate each shard across several servers.
> 
> Now, the user is always required to search within a single month (= shard). 
> Most importantly, we expect an absolute majority of the requests to query the 
> current month, with only a minor load on the previous months. In order to 
> utilise the cluster most efficiently, we would like a majority of the servers 
> to contain replicas of the current month data, and have only one or two 
> servers per older month. To this end, we are planning to have a set of slaves 
> that "migrate" from master to master, depending on which master holds the 
> data for the current month. When a new month starts, those slaves have to be 
> reconfigured to hold the new shard and to replicate from the new master 
> (their old master now holding the data for the previous month).
> 
> Since this operation has to be done every month, we are naturally considering 
> automating it. So my question is whether anyone has faced a similar problem 
> before, and what is the best way to solve it. We are not committed to any 
> solution, or even architecture, so feel free to propose different solutions. 
> The only requirement is that a majority of the servers should be able to 
> serve requests to the current month at any given moment.
> 
> Thank you in advance for your answers.
> 
> Best regards,
> Sergey Sazonov.
