On 25/03/15 15:03, Ian Rose wrote:
Per - Wow, 1 trillion documents stored is pretty impressive.  One
clarification: when you say that you have 2 replicas per collection on each
machine, what exactly does that mean?  Do you mean that each collection is
sharded into 50 shards, divided evenly over all 25 machines (thus 2 shards
per machine)?
Yes
   Or are some of these slave replicas (e.g. 25x sharding with
1 replica per shard)?
No replication. It does not work very well, at least in 4.4.0. Besides that, I am not a big fan of two (or more) machines each having to do all the indexing work and keep themselves synchronized. For HA at the data level, use a distributed file system that keeps multiple copies of every piece of data (like HDFS). Have only one Solr node handle the indexing into a particular shard - if this Solr node breaks down, let another Solr node take over the indexing "leadership" of that shard. Besides the indexing Solr node, several other Solr nodes can serve data from the shard just by watching the data folder (and the commits) written by the shard's indexing leader - that gives you HA at the service level. That is probably how we are going to do HA - pretty soon. But that is another story.
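
For illustration only, the one-indexing-leader-per-shard idea described above could be modelled roughly as in the toy Python sketch below. It is not Solr code and not the actual setup: the Shard class, its fields, and the node names are all hypothetical, and the "shared storage" is just a commit counter standing in for index files kept on something like HDFS.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    """Toy model of one shard whose index files live on shared, replicated storage."""
    name: str
    leader: str                                       # the only node allowed to index
    readers: list = field(default_factory=list)       # nodes that only serve queries
    last_commit: int = 0                              # commit generation written by the leader
    reader_view: dict = field(default_factory=dict)   # commit each reader has opened

    def index_and_commit(self, node: str) -> None:
        # Only the current leader may write; every other node is read-only.
        if node != self.leader:
            raise PermissionError(f"{node} is not the indexing leader of {self.name}")
        self.last_commit += 1

    def refresh_readers(self) -> None:
        # Readers "watch the data folder": they reopen on the newest commit.
        for reader in self.readers:
            self.reader_view[reader] = self.last_commit

    def fail_over(self, dead_node: str) -> None:
        # If the leader dies, promote a reader. No data is copied, because the
        # index is assumed to be safe on the shared file system.
        if dead_node == self.leader:
            if not self.readers:
                raise RuntimeError(f"no node left to lead {self.name}")
            new_leader = self.readers.pop(0)
            self.reader_view.pop(new_leader, None)
            self.leader = new_leader
        elif dead_node in self.readers:
            self.readers.remove(dead_node)
        self.reader_view.pop(dead_node, None)


# Hypothetical usage: node-a indexes, node-b and node-c serve queries.
shard = Shard("shard1", leader="node-a", readers=["node-b", "node-c"])
shard.index_and_commit("node-a")
shard.refresh_readers()
shard.fail_over("node-a")          # node-b takes over the indexing leadership
shard.index_and_commit("node-b")
```

The point of the sketch is only that a failover changes who is allowed to write; the data itself never has to be re-indexed or synchronized between Solr nodes.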

Thanks!
No problem
