A bit late since the OP posted this, not sure if it is still relevant, but anyway...
>> Under what circumstances will an ES node evict entries from its field
>> data cache? We're also deleting documents from the index; can this have an
>> impact? What other things should I be looking at to find a correlation (GC
>> time does not seem to be correlated)?

The cache implements an LRU eviction policy: when the cache becomes full, the
least recently used data is evicted to make way for new data.
http://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-cache.html

More information here:
http://www.elastic.co/guide/en/elasticsearch/guide/current/_monitoring_individual_nodes.html#_indices_section

It's puzzling in your case that you set the cache size to 10GB but per-node
usage is only 3.6GB. Have you used the other API to check whether the cache
reports the same numbers there?
http://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-shard-query-cache.html#_monitoring_cache_usage
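For a quick cross-check, these are the calls I would try (just a sketch,
assuming a node reachable at localhost:9200; adjust host/port for your
cluster):

    # Per-node fielddata size and eviction counters from the node stats API
    curl 'localhost:9200/_nodes/stats/indices/fielddata?fields=*&pretty'

    # Fielddata memory usage broken down per field, per node
    curl 'localhost:9200/_cat/fielddata?v'

    # Each node's effective settings, to confirm indices.fielddata.cache.size
    # was actually picked up at startup
    curl 'localhost:9200/_nodes/settings?pretty'

The last call returns each node's settings, so you can confirm the 10gb value
is what every node is actually running with.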
There are also a few additional links which might give you hints:
http://evertrue.github.io/blog/2014/11/16/3-performance-tuning-tips-for-elasticsearch/
https://github.com/elastic/elasticsearch/issues/3639
http://www.elastic.co/guide/en/elasticsearch/guide/current/_limiting_memory_usage.html

Hope it helps.
Jason

On Tuesday, September 16, 2014 at 10:25:08 PM UTC+8, Philippe Laflamme wrote:
>
> Sorry for bumping this, but I'm a little stumped here.
>
> We have some nodes that are evicting fielddata cache entries for seemingly
> no reason:
> 1) we've set indices.fielddata.cache.size to 10gb
> 2) the metrics from the node stats endpoint show that
> indices.fielddata.memory_size_in_bytes never exceeded 3.6GB on any node.
> 3) the rate of eviction is normally 0, but goes above that even though
> the fielddata cache size is nowhere near 10GB
>
> Attached is a plot of the max(indices.fielddata.memory_size_in_bytes) (red
> line) and sum(indices.fielddata.evictions) (green line) across all nodes in
> the cluster. Note that we create a fresh index every day that replaces an
> older one (that explains the change in profile around midnight).
>
> As you can see, the size (on any given node) never exceeds 3.6GB, yet even
> at a lower value (around 2.2GB), some nodes start evicting entries from the
> cache. Also, starting around Tue 8AM, the max(field cache size) becomes
> erratic and jumps up and down.
>
> I can't explain this behaviour, especially since we've been operating for
> a while at this volume and rate of documents. This was not happening
> before. Though it's possible that we're getting a higher volume of data, it
> doesn't look substantially different from the past.
>
> Under what circumstances will an ES node evict entries from its field
> data cache? We're also deleting documents from the index; can this have an
> impact? What other things should I be looking at to find a correlation (GC
> time does not seem to be correlated)?
>
> Thanks,
> Philippe
>
> On Friday, September 12, 2014 9:33:16 AM UTC-4, Philippe Laflamme wrote:
>>
>> Forgot to mention that we're using ES 1.1.1
>>
>> On Friday, September 12, 2014 9:21:23 AM UTC-4, Philippe Laflamme wrote:
>>>
>>> Hi,
>>>
>>> I have a cluster with nodes configured with an 18G heap. We've noticed a
>>> degradation in performance recently after increasing the volume of data
>>> we're indexing.
>>>
>>> I think the issue is due to the field data cache doing evictions. Some
>>> nodes are doing lots of them, some aren't doing any. This is explained by
>>> our routing strategy, which results in non-uniform document distribution.
>>> Maybe we can improve this eventually, but in the meantime, I'm trying to
>>> understand why the nodes are evicting cached data.
>>>
>>> The metrics show that the field data cache is only ~1.5GB in size, yet
>>> we have this in our elasticsearch.yml:
>>>
>>> indices.fielddata.cache.size: 10gb
>>>
>>> Why would a node evict cache entries when it should still have plenty of
>>> room to store more? Are we missing another setting? Is there a way to tell
>>> what the actual fielddata cache size is at runtime (maybe it did not pick
>>> up the configuration setting for some reason)?
>>>
>>> Thanks,
>>> Philippe