Hi,

I'd like to get some architectural advice on setting up a Solr
(v1.4) platform in a production environment.
I'll first describe my target architecture and then ask my questions
about that environment.


Here's briefly what I've achieved so far:

I've already set up an environment that serves as a proof of concept. It
is composed of a master instance on one host
and a slave instance on a second host. The slave hosts 2 Solr cores.
In the final version of the architecture I would add one or more slave
nodes, depending on the request load.

                                                  request   
                                                       |
                                                       V  
[  MASTER [core]  ] ------- [SLAVE [core1] <--swap-->[core2]  ]
                   |
                   v
           [index backup]

The goals of this architecture are:
* Isolate indexing from querying
* Enable index replication from master to slave
* Control the switch to a newly replicated index (hence the two cores per
slave)

Here's how the whole platform works when we need to renew the index (on the
slaves):
1- back up the index files on the master using Solr's backup capability (a
backup is always welcome)
2- launch index creation (I'm using the delta indexing capabilities in order
to limit the index generation time)
3- trigger replication from the master core to slave core2, again using
Solr's built-in capabilities
4- trigger the swap between core1 and core2
5- at this point the slave index has been renewed; we can revert to the
previous index if there are any issues with the new one.
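
To make the sequence concrete, here is a rough sketch of the HTTP calls
behind steps 1 to 4. It only builds the URLs; it does not contact any server.
Assumptions on my side: the host names, the port and the master core name
"core" are placeholders, step 2 is assumed to go through the
DataImportHandler registered at /dataimport, and the helper name
build_renewal_urls is just for illustration.

```python
from urllib.parse import urlencode


def build_renewal_urls(master, slave, master_core="core",
                       live_core="core1", staging_core="core2"):
    """Return the ordered (step, url) pairs for one index renewal cycle.

    `master` and `slave` are base URLs such as "http://master:8983"
    (placeholder hosts, not taken from a real setup).
    """
    master_core_url = f"{master}/solr/{master_core}"
    return [
        # Step 1: snapshot the master index via the ReplicationHandler.
        ("backup",
         f"{master_core_url}/replication?" + urlencode({"command": "backup"})),
        # Step 2: delta indexing (assumes DataImportHandler at /dataimport).
        ("delta-import",
         f"{master_core_url}/dataimport?" + urlencode({"command": "delta-import"})),
        # Step 3: ask the staging core on the slave to pull from the master.
        ("fetchindex",
         f"{slave}/solr/{staging_core}/replication?" + urlencode({
             "command": "fetchindex",
             "masterUrl": f"{master_core_url}/replication",
         })),
        # Step 4: swap the staging core with the live one (CoreAdmin SWAP).
        ("swap",
         f"{slave}/solr/admin/cores?" + urlencode({
             "action": "SWAP", "core": live_core, "other": staging_core,
         })),
    ]


if __name__ == "__main__":
    for step, url in build_renewal_urls("http://master:8983", "http://slave:8983"):
        print(f"{step:13s} {url}")
```

Note that the backup, delta-import and fetchindex calls return before the
work is finished, so a real script would have to poll for completion between
steps rather than fire the four requests back to back. Reverting (step 5)
would simply be the same SWAP call issued a second time.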

As this is meant to be a production environment, redundancy is one of the
key elements, meaning that we will double (or more) the front-end Solr
instances. If the slave instances are not in the same network as the master
instance, our strategy will probably be to set up one of the slaves
as a relay.

That said, here are my questions:

1 / What issues might arise with this kind of architecture?

2 / My first concern is about the size of the index that would need to be
replicated. We need to perform indexing all day long (every 5 min) and
replicate as soon as the index is built.
As far as I know, replication copies over all the index files, so I assume
there is no delta replication (replicating only what changed).
Is there any way to do delta replication, if that makes sense?
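
To be clear about what I mean by "delta replication", here is a small
sketch in plain Python. It is not Solr code and I am not claiming Solr works
this way; the function name, file names and sizes are made up for
illustration. The idea is to compare the file listings of the two indexes
and transfer only the files the slave lacks or holds with a different size.

```python
def files_to_copy(master_files, slave_files):
    """Return the sorted names of master index files the slave must fetch.

    Both arguments map a file name to its size in bytes. A file is fetched
    when the slave does not have it or has it with a different size.
    """
    return sorted(
        name for name, size in master_files.items()
        if slave_files.get(name) != size
    )


if __name__ == "__main__":
    # Hypothetical listings: one segment is shared, one is new on the master.
    master = {"_1.cfs": 100, "_2.cfs": 50, "segments_3": 1}
    slave = {"_1.cfs": 100, "segments_2": 1}
    print(files_to_copy(master, slave))
```

With a 5-minute indexing cycle, only the newly written segment files would
travel over the network in this scheme, instead of the whole index.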

3 / How can I improve this architecture, based on your own experience?
E.g., should I use different network interfaces for Solr commands and requests?

Thank you for sharing.
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Architecture-discussion-tp825708p825708.html
Sent from the Solr - User mailing list archive at Nabble.com.
