Awesome, OK, thank you. Is the logic behind not storing data on master nodes twofold: to take advantage of a system with limited storage resources, and to have a dedicated results aggregator/search handler?
I can imagine that if I had a particularly gnarly, badly written search, having a master both run the query and deal with the results at the same time could be bad. So in a 16 node cluster you'd want to have 9 nodes allowed to be masters, (n/2)+1?

Thanks again!
Josh

On Friday, March 21, 2014 3:20:24 PM UTC-7, Mark Walkom wrote:
>
> A couple of things;
>
> 1. You should have n/2+1 masters in your cluster, where n = number of
>    nodes. This helps prevent split brain situations and is best practise.
> 2. Your master nodes can store data, this way you don't need to add
>    more nodes to fulfil the above.
>
> Your indexing scenario is correct.
> For searching, replicas and primaries can be queried.
> For both - Adding more masters adds redundancy as per the first two
> points. Adding more search nodes won't do much though other than reduce the
> load on your masters (unless someone else can add anything I don't know :p).
>
> And for your final question, yes that is correct.
>
> To give you an idea of practical application, we don't use search nodes
> but have 3 non-data masters that handle all queries, and a bunch of data
> only nodes for storing everything.
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com
> web: www.campaignmonitor.com
>
>
> On 22 March 2014 08:25, Josh Harrison <hij...@gmail.com> wrote:
>
>> I'm trying to build a basic understanding of how indexing and searching
>> works, hopefully someone can either point me to good resources or explain!
>> I'm trying to figure out what having multiple "coordinator" nodes as
>> defined in the elasticsearch.yml would do, and what having multiple "search
>> load balancer" nodes would do. Both in the context of indexing and
>> searching.
>> Is there a functional difference between a "coordinator" node and a
>> "search load balancer" node, beyond the fact that a "search load balancer"
>> node can't be elected master?
>>
>>
>> Say I have a 4 node cluster. There's a master only "coordinator" node,
>> that doesn't store data, named "master".
>> node.master: true
>> node.data: false
>>
>> There are three data only nodes, "A", "B" and "C"
>> node.master: false
>> node.data: true
>>
>> I have an index "test" with two shards and one replica. Primary shard 0
>> lives on A, primary shard 1 lives on C, replica shard 0 lives on B, replica
>> shard 1 lives on A.
>>
>> I send the command
>> curl -XPOST http://master:9200/test/test -d '{"foo":"bar"}'
>>
>> A connection is made to master, and the data is sent to master to be
>> indexed. Master randomly decides to place this document in shard 1, so it
>> gets sent to the primary shard 1 on C and replica shard 1 on A, right? This
>> is where routing can come in, I can say that that document really should go
>> to shard 0 because I said so.
>>
>> So this is a fairly simple scenario, assuming I'm correct.
>>
>> What benefit do I get to indexing when I add more "coordinator" nodes?
>> node.master: true
>> node.data: false
>>
>> What about if I add "search load balancer" nodes?
>> node.master: false
>> node.data: false
>>
>>
>>
>> How about on the searching side of things?
>> I send a search to master,
>> curl -XPOST http://master:9200/test/test/_search -d '{"query":{"match_all":{}}}'
>>
>> Master sends these queries off to A, B and C, who each generate their own
>> results and return them to master. Each data node queries all the relevant
>> shards that are present locally and then combines those results for
>> delivery to master. Do only primary shards get queried, or are replica
>> shards queried too?
>> Master takes these combined results from all the relevant nodes and
>> combines them into the final query response.
>>
>> Same questions:
>> What benefit do I get to searching when I add more nodes that are like
>> master?
>> node.master: true
>> node.data: false
>>
>> What about if I add "search load balancer" nodes?
>> node.master: false
>> node.data: false
>>
>>
>> Is the only difference between a
>> node.master: true
>> node.data: false
>> and a
>> node.master: false
>> node.data: false
>> that the node is a candidate to be a master, should it be elected?
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearc...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/eaff1d85-1e85-422d-bfba-9a0825ed5da9%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
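
A rough sketch of the two bits of arithmetic that come up in this thread, in case it helps. As far as I know, the (n/2)+1 rule in Elasticsearch of this era is a quorum over master-eligible nodes (the value given to the discovery.zen.minimum_master_nodes setting), not a count of how many nodes in the whole cluster must be master-eligible. Likewise, the target shard for a document is not picked at random: it is derived from a hash of the routing value (the document ID by default) modulo the number of primary shards. The function names below are made up for illustration, and zlib.crc32 is only a stand-in hash, not the one Elasticsearch uses internally.

```python
import zlib


def minimum_master_nodes(master_eligible: int) -> int:
    """Quorum of master-eligible nodes: (n / 2) + 1, using integer division."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def pick_shard(routing: str, primary_shards: int) -> int:
    """Deterministic shard choice: hash(routing) mod number of primary shards.

    zlib.crc32 is an illustrative stand-in (an assumption of this sketch);
    Elasticsearch uses its own internal hash function.
    """
    if primary_shards < 1:
        raise ValueError("need at least one primary shard")
    return zlib.crc32(routing.encode("utf-8")) % primary_shards


if __name__ == "__main__":
    # Quorum sizes for a few master-eligible node counts.
    for n in (3, 9, 16):
        print(n, "master-eligible nodes -> quorum of", minimum_master_nodes(n))

    # The same routing value always lands on the same shard of the
    # two-shard "test" index; a custom routing value changes the input
    # to the hash rather than naming a shard directly.
    print(pick_shard("doc-1", 2) == pick_shard("doc-1", 2))
```

With three dedicated master-eligible nodes, as in Mark's setup, the quorum works out to 2, regardless of how many data-only nodes sit behind them.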