There are a few people in the IRC channel who have done it; however,
cross-WAN clusters are generally not recommended, as ES is sensitive to
latency.
You may be better off using the snapshot/restore process, or another
export/import method.
Regards,
Mark Walkom
Infrastructure Engineer
Campaign
Hey Mark,
Thanks for this. It seems like our best bet will be to manage indexes
the same way across all regions, since they're really mirrors. Since our
documents are immutable, we'll just queue them up for each region, and
each region will insert or delete them in its own index. It's the
You could look at using a queuing system, like RabbitMQ, that your
application drops the data into, then have a Logstash instance in each DC
that pulls off the queue and pushes into ES.
That way you can easily handle replicating the data to multiple
endpoints within RabbitMQ.
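The fan-out pattern above can be sketched in memory. This is only an illustration of the flow, not real RabbitMQ or Logstash code: `FanoutExchange` and `drain` are hypothetical stand-ins for a RabbitMQ fanout exchange and a per-DC Logstash consumer, and the region names are made up.

```python
from collections import deque

class FanoutExchange:
    """Replicates every published message to one queue per region,
    like a RabbitMQ fanout exchange bound to a queue per DC."""
    def __init__(self, regions):
        self.queues = {region: deque() for region in regions}

    def publish(self, message):
        for queue in self.queues.values():
            queue.append(message)

def drain(queue, index):
    """Per-region consumer: apply insert/delete ops to the local index.
    The documents are immutable, so only insert and delete exist."""
    while queue:
        op, doc_id, doc = queue.popleft()
        if op == "insert":
            index[doc_id] = doc
        elif op == "delete":
            index.pop(doc_id, None)

exchange = FanoutExchange(["us-east", "eu-west"])
exchange.publish(("insert", "doc1", {"title": "hello"}))
exchange.publish(("delete", "doc1", None))
exchange.publish(("insert", "doc2", {"title": "world"}))

indexes = {region: {} for region in exchange.queues}
for region, queue in exchange.queues.items():
    drain(queue, indexes[region])

# Both regions converge to the same index contents.
print(indexes["us-east"] == indexes["eu-west"])  # True
```

The point is that the application publishes once, and the queuing layer, not the application, is responsible for delivering every operation to every region.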
Regards,
I haven't heard of a limit to the number of indexes; obviously, the more
you have, the larger the cluster state that needs to be maintained.
You might want to look into routing (
http://exploringelasticsearch.com/advanced_techniques.html or
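As a sketch of what routing does: Elasticsearch hashes the routing value (by default the document `_id`) and takes it modulo the number of primary shards to pick a shard. The real implementation uses a murmur3 hash; `md5` below is just a stand-in to illustrate the idea, and the routing key is made up.

```python
import hashlib

def shard_for(routing_value, num_primary_shards):
    # Hash the routing value and take it modulo the primary shard count.
    # Real ES uses murmur3; md5 here only illustrates the principle.
    digest = hashlib.md5(routing_value.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_primary_shards

# Every document routed by the same key lands on the same shard, so a
# search issued with that routing value only has to touch one shard.
assert shard_for("user-42", 5) == shard_for("user-42", 5)
```

This is why routing is useful for locality: all of one user's documents can be kept together, and queries for that user skip the other shards entirely.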
Thanks for the feedback Mark.
I agree with your thoughts on the testing. We plan on doing some testing,
finding our failure point, and dialing that back to some value that allows
us to still run the migration. This way, we can get ahead of the problem.
Since a re-index would actually introduce more
The knapsack plugin does not incur downtime. You can increase shards
on the fly by copying an index over to another index (even on another
cluster), though the index should be write-disabled during the copy.
Increasing the replica level is a very simple command; no index copy required.
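The copy-and-swap approach described here can be sketched with in-memory dicts standing in for indices. All names are hypothetical; against a real cluster the copy would be done by a tool such as knapsack and the final step by the index-aliases API, which swaps the alias atomically.

```python
# Simulated cluster state: one concrete index behind an alias.
indexes = {"logs_v1": {"docs": {"1": {"msg": "a"}, "2": {"msg": "b"}},
                       "shards": 5, "writable": True}}
aliases = {"logs": "logs_v1"}

def expand_shards(alias, new_name, new_shards):
    """Grow shard count by copying to a new index and repointing the alias."""
    source = indexes[aliases[alias]]
    source["writable"] = False          # write-disable during the copy
    indexes[new_name] = {"docs": dict(source["docs"]),
                         "shards": new_shards, "writable": True}
    aliases[alias] = new_name           # in real ES, an atomic alias swap

expand_shards("logs", "logs_v2", 10)
print(aliases["logs"], indexes["logs_v2"]["shards"])  # logs_v2 10
```

Readers and writers that address the alias rather than the concrete index name never notice the switch, which is why the later discussion of aliases matters for avoiding downtime.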
It seems
Hey Jörg,
Thank you for your response. A few questions/points.
In our use cases, the inability to write or read is considered downtime.
Therefore, I cannot disable writes during expansion. Your alias points
raise some interesting research I need to do, and I have a few follow-up
questions.
Thanks for raising the questions, I will come back later in more detail.
Just a quick note: the idea that shards scale writes and replicas scale
reads is correct, but Elasticsearch is also elastic, which means it
scales out by adding node hardware. The shard/replica scale pattern
finds its limits
Hey Jorg,
Thanks for the reply. We're using Cassandra heavily in production, so I'm
very familiar with the scale-out concepts. What we've seen in all our
distributed systems is that at some point you reach a saturation of your
capacity for a single node. In the case of ES, to me that would
Yes, routing is very powerful. The general use case is to introduce a
mapping over a large number of shards so you can store related parts of
the data on the same shard, which is good for locality. For example,
combined with an index alias working on filter terms, you can create one
big concrete index,
1) The answer is: it depends. You want to set up a test system with
indicative specs, and then throw some sample data at it until things start
to break. However, this may help
https://www.found.no/foundation/sizing-elasticsearch/
2) https://github.com/jprante/elasticsearch-knapsack might do what
Thanks for the answers Mark. See inline.
On Wed, Jun 4, 2014 at 3:51 PM, Mark Walkom ma...@campaignmonitor.com
wrote:
1) The answer is - it depends. You want to setup a test system with
indicative specs, and then throw some sample data at it until things start
to break. However this may