I use this on 1.1.0 in my config/elasticsearch.yml:

    index:
      merge:
        scheduler:
          type: concurrent
          max_thread_count: 4
        policy:
          type: tiered
          max_merged_segment: 1gb
          segments_per_tier: 4
          max_merge_at_once: 4
          max_merge_at_once_explicit: 4

    threadpool:
      merge:
        type: fixed
        size: 4
        queue_size: 32

Explanation:

- use the concurrent scheduler and limit it to 4 threads. I find that 4 threads can keep up with the highest bulk insertion rate I could generate
- use the tiered policy (the default; it is the most flexible in selecting segments to merge)
- create segments of less than 1gb in a tier (this limits the size of the segment files; the smaller the files, the faster the merges, but the more files are created)
- create 4 segments per tier (do not let the number of segments per tier get too high)
- merge 4 segments at each merge step (this limits the total run time and resource consumption of a segment merge step)
- apply the same limit to merges triggered by an explicit _optimize API call
- extend the thread pool to 4 merge threads with a maximum of 32 merge operations in the queue (32 should be sufficient to handle outstanding merges)

From time to time, if the number of files gets very high (>500) and the index is calm (no indexing, no heavy search), I do a manual _optimize.

Jörg

On Fri, Apr 18, 2014 at 9:01 PM, David Smith <davidksmit...@gmail.com> wrote:

> I see that ES switched back to ConcurrentMergeScheduler in 1.1.1 due to it
> affecting indexing performance in 1.1.0.
> https://github.com/elasticsearch/elasticsearch/issues/5817
>
> We're on 1.1.0 and cannot upgrade to 1.1.1 for the time being. Is there a
> way to switch it back using the API? I tried the following command, but it
> seems to not take.
>
> curl -i -XPUT localhost:9200/_cluster/settings -d '{ "persistent": {
> "index.merge.scheduler.type":
> "org.elasticsearch.index.merge.scheduler.ConcurrentMergeSchedulerProvider"
> } }'
> HTTP/1.1 200 OK
> Content-Type: application/json; charset=UTF-8
> Content-Length: 52
>
> {"acknowledged":true,"persistent":{},"transient":{}}
>
>
> It does not seem to be set when I try to re-GET it (and no errors in logs
> at DEBUG level or above).
>
> curl -i -XGET localhost:9200/_cluster/settings
> HTTP/1.1 200 OK
> Content-Type: application/json; charset=UTF-8
> Content-Length: 66
>
> {"persistent":{"threadpool":{"bulk":{"size":"8"}}},"transient":{}}
>
>
> Am I using the wrong way of specifying the scheduler? I also tried just
> specifying ConcurrentMergeSchedulerProvider instead of the full class
> name, but that didn't work.
>
> Any ideas?
> David
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/601a831d-2c8e-4615-b816-435a6d4e4d9c%40googlegroups.com
> .
> For more options, visit https://groups.google.com/d/optout.
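
PS: The nested YAML at the top of this message and the dotted name in the quoted curl call ("index.merge.scheduler.type") are two spellings of the same kind of setting key. As a rough illustration only, here is a small Python sketch that flattens my nested settings into dotted names so the two forms can be compared. The names flatten_settings and SETTINGS are made up for this example and are not part of Elasticsearch. The sketch says nothing about whether a given setting can be changed at runtime through the API.

```python
def flatten_settings(nested, prefix=""):
    """Flatten nested settings into dotted keys, e.g. index.merge.scheduler.type."""
    flat = {}
    for key, value in nested.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into sub-sections, carrying the dotted path along.
            flat.update(flatten_settings(value, dotted))
        else:
            flat[dotted] = value
    return flat


# Values copied from the elasticsearch.yml at the top of this message.
# SETTINGS is a hypothetical name used only for this illustration.
SETTINGS = {
    "index": {
        "merge": {
            "scheduler": {"type": "concurrent", "max_thread_count": 4},
            "policy": {
                "type": "tiered",
                "max_merged_segment": "1gb",
                "segments_per_tier": 4,
                "max_merge_at_once": 4,
                "max_merge_at_once_explicit": 4,
            },
        }
    },
    "threadpool": {"merge": {"type": "fixed", "size": 4, "queue_size": 32}},
}

if __name__ == "__main__":
    # Print one dotted setting per line, sorted by name.
    for name, value in sorted(flatten_settings(SETTINGS).items()):
        print(f"{name}: {value}")
```

Running it prints one dotted key per line, which makes it easy to check a key name against what a settings GET returns.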