Re: Shard count and plugin questions

2014-06-10 Thread Mark Walkom
A few people in the IRC channel have done it; however, cross-WAN clusters are generally not recommended, as ES is sensitive to latency. You may be better off using the snapshot/restore process, or another export/import method. Regards, Mark Walkom Infrastructure Engineer Campaign

Re: Shard count and plugin questions

2014-06-10 Thread Todd Nine
Hey Mark, Thanks for this. It seems like our best bet will be to manage indexes the same way across all regions, since they're really mirrors. Since our documents are immutable, we'll just queue them up for each region, and each region will insert them into, or delete them from, its own index. It's the

Re: Shard count and plugin questions

2014-06-10 Thread Mark Walkom
You could look at using a queuing system, like RabbitMQ, into which your application drops the data, and then have a Logstash instance in each DC that pulls off the queue and pushes into ES. That way you can easily handle the replication of the data to multiple endpoints within RabbitMQ. Regards,
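
A minimal sketch of the fan-out idea described above, using in-memory queues as a stand-in for RabbitMQ and plain dicts as a stand-in for each region's index. The class name, region names, and message shape are assumptions made for illustration; they do not come from the thread or from any RabbitMQ or Logstash API.

```python
from collections import deque


class FanoutExchange:
    """In-memory stand-in for a fanout exchange: every bound queue gets a copy."""

    def __init__(self):
        self.queues = {}

    def bind(self, name):
        # One queue per data center / region (names are hypothetical).
        self.queues[name] = deque()
        return self.queues[name]

    def publish(self, message):
        # Copy the message so one consumer cannot affect another's view of it.
        for q in self.queues.values():
            q.append(dict(message))


def drain(queue, index):
    """Apply queued operations, in order, to a dict standing in for a regional index."""
    while queue:
        op = queue.popleft()
        if op["action"] == "index":
            index[op["id"]] = op["doc"]
        elif op["action"] == "delete":
            index.pop(op["id"], None)
    return index


exchange = FanoutExchange()
us_queue = exchange.bind("us-east")
eu_queue = exchange.bind("eu-west")

exchange.publish({"action": "index", "id": "1", "doc": {"title": "first"}})
exchange.publish({"action": "index", "id": "2", "doc": {"title": "second"}})
exchange.publish({"action": "delete", "id": "1"})

us_index = drain(us_queue, {})
eu_index = drain(eu_queue, {})
```

Because each region consumes the same ordered stream of operations independently, the regional indexes converge on the same contents without the clusters talking to each other across the WAN.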

Re: Shard count and plugin questions

2014-06-05 Thread Mark Walkom
I haven't heard of a limit to the number of indexes, though obviously the more you have, the larger the cluster state that needs to be maintained. You might want to look into routing ( http://exploringelasticsearch.com/advanced_techniques.html or

Re: Shard count and plugin questions

2014-06-05 Thread Todd Nine
Thanks for the feedback, Mark. I agree with your thoughts on the testing. We plan on doing some testing, finding our failure point, and dialing that back to some value that allows us to still run the migration. This way, we can get ahead of the problem. Since a re-index would actually introduce more

Re: Shard count and plugin questions

2014-06-05 Thread joergpra...@gmail.com
The knapsack plugin does not require downtime. You can increase shards on the fly by copying an index over to another index (even on another cluster). The index should be write-disabled during the copy, though. Increasing the replica level is a very simple command; no index copy is required. It seems
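
A small sketch of the "very simple command" for raising the replica level: it only builds the HTTP request for the index settings update endpoint rather than sending it, so it runs without a cluster. The helper name and the index name `my_index` are hypothetical.

```python
import json


def replica_update_request(index_name, replicas):
    """Build (method, path, body) for an index settings update that changes the replica count."""
    if replicas < 0:
        raise ValueError("replica count cannot be negative")
    body = {"index": {"number_of_replicas": replicas}}
    return "PUT", "/{}/_settings".format(index_name), json.dumps(body)


method, path, body = replica_update_request("my_index", 2)
```

The resulting request can be sent with any HTTP client. Changing the number of primary shards has no equivalent setting, which is why that case needs a copy into a new index.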

Re: Shard count and plugin questions

2014-06-05 Thread Todd Nine
Hey Jörg, Thank you for your response. A few questions/points. In our use cases, the inability to write or read is considered downtime. Therefore, I cannot disable writes during expansion. Your alias points raise some interesting research I need to do, and I have a few follow-up questions.

Re: Shard count and plugin questions

2014-06-05 Thread joergpra...@gmail.com
Thanks for raising the questions, I will come back later in more detail. Just a quick note: the idea that shards scale writes and replicas scale reads is correct, but Elasticsearch is also elastic, which means it scales out by adding node hardware. The shard/replica scale pattern finds its limits

Re: Shard count and plugin questions

2014-06-05 Thread Todd Nine
Hey Jorg, Thanks for the reply. We're using Cassandra heavily in production, so I'm very familiar with scale-out concepts. What we've seen in all our distributed systems is that at some point, you reach a saturation of your capacity for a single node. In the case of ES, to me that would

Re: Shard count and plugin questions

2014-06-05 Thread joergpra...@gmail.com
Yes, routing is very powerful. The general use case is to introduce a mapping to a large number of shards so you can store related parts of the data on the same shard, which is good for locality. For example, combined with index aliases that filter on terms, you can create one big concrete index,
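
A hedged sketch of the alias-plus-routing pattern mentioned above: it builds the body of an `_aliases` request that adds a filtered alias with a routing value on top of one concrete index. The index name `big_index`, the field `user_id`, and the alias naming scheme are assumptions for illustration only.

```python
def user_alias_action(index_name, user_id):
    """Build an _aliases request body: one filtered, routed alias per user."""
    return {
        "actions": [
            {
                "add": {
                    "index": index_name,
                    "alias": "user_{}".format(user_id),
                    # Only documents for this user are visible through the alias.
                    "filter": {"term": {"user_id": user_id}},
                    # Reads and writes through the alias are routed by this value.
                    "routing": user_id,
                }
            }
        ]
    }


request_body = user_alias_action("big_index", "42")
add = request_body["actions"][0]["add"]
```

The body would be sent as a POST to `/_aliases`. The application then addresses `user_42` as if it were its own index, while all of its documents share a routing value inside the one shared concrete index.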

Re: Shard count and plugin questions

2014-06-04 Thread Mark Walkom
1) The answer is: it depends. You want to set up a test system with indicative specs, and then throw some sample data at it until things start to break. However, this may help: https://www.found.no/foundation/sizing-elasticsearch/ 2) https://github.com/jprante/elasticsearch-knapsack might do what

Re: Shard count and plugin questions

2014-06-04 Thread Todd Nine
Thanks for the answers, Mark. See inline. On Wed, Jun 4, 2014 at 3:51 PM, Mark Walkom ma...@campaignmonitor.com wrote: 1) The answer is: it depends. You want to set up a test system with indicative specs, and then throw some sample data at it until things start to break. However, this may