Hey all, I have a 3-node Elasticsearch 1.0.1 cluster running on Windows 
Server 2012 (in Azure). There are about 20 million documents that take up a 
total of 40GB (including replicas). There are about 400 indexes in total, 
some with millions of documents and some with just a few. Each 
index is set to have 3 shards and 1 replica. The main cluster is running 
on three 4-core machines with 7GB of RAM. The min/max JVM heap size is 
set to 4GB.
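
For a rough sense of scale, here is the shard arithmetic those numbers imply, as a small Python sketch. It assumes exactly 400 indexes and an even spread of shards across the three nodes, neither of which is exact for my cluster.

```python
# Rough shard arithmetic for the cluster described above.
# Assumptions: exactly 400 indexes, shards spread evenly over the nodes.
indexes = 400
primaries_per_index = 3
replicas = 1
nodes = 3

# Every primary shard has one replica copy, so each index holds 6 shards.
total_shards = indexes * primaries_per_index * (1 + replicas)
shards_per_node = total_shards // nodes

total_data_gb = 40  # on-disk size, replicas included
avg_shard_mb = total_data_gb * 1024 / total_shards

print(total_shards, shards_per_node, round(avg_shard_mb, 1))
```

So each node is carrying a large number of very small shards on average, which is part of why I'm asking about collapsing the indexes.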

The primary use case for this cluster is faceting/aggregations over the 
documents.  There's almost no full text searching, so everything is pretty 
much based on exact values (which are stored but not analyzed at index time).

When running term facets on a few of these indexes (the biggest one 
contains about 8 million documents) I'm seeing really long response times 
(over 5 seconds). There are potentially thousands of distinct values for the 
field I'm faceting on, but I would still have expected faster performance.
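
For reference, the requests look roughly like the bodies built below. This is a sketch, not my exact query: the field name "category" and the facet name "by_value" are placeholders, and the second function shows what I understand to be the equivalent terms aggregation.

```python
import json

def terms_facet_body(field, size=10):
    """Search body with a terms facet; "size": 0 skips returning hits."""
    return {
        "size": 0,
        "query": {"match_all": {}},
        "facets": {"by_value": {"terms": {"field": field, "size": size}}},
    }

def terms_agg_body(field, size=10):
    """Same request expressed as a terms aggregation."""
    return {
        "size": 0,
        "query": {"match_all": {}},
        "aggs": {"by_value": {"terms": {"field": field, "size": size}}},
    }

# "category" is a placeholder for the not_analyzed field being faceted on.
facet_body = terms_facet_body("category")
agg_body = terms_agg_body("category")

# Either body would be sent to <index>/_search.
print(json.dumps(facet_body, indent=2))
print(json.dumps(agg_body, indent=2))
```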

So my goal is to speed up these queries and get the responses sub-second if 
possible. To that end I have some questions:
1) Would switching to Linux give me better performance in general?
2) I could collapse almost all of these 400 indexes into a single big 
index and use aliases + filters instead. Would this be advisable?
3) Would tuning the field data cache yield any better results?
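
To make option 2 concrete, this is roughly what I have in mind: one filtered alias per old index, each pointing at the merged index. It's only a sketch. The names "all_docs", "source", "customers_a" and "customers_b" are made up, and it assumes every document carries a field recording which old index it came from.

```python
import json

def filtered_alias_action(index, alias, field, value):
    """Build one "add" action for the _aliases API with a term filter."""
    return {
        "add": {
            "index": index,
            "alias": alias,
            "filter": {"term": {field: value}},
        }
    }

# Hypothetical names: "all_docs" is the merged index, "source" is a
# not_analyzed field holding the name of the index a document used to be in.
old_indexes = ["customers_a", "customers_b"]
body = {
    "actions": [
        filtered_alias_action("all_docs", name, "source", name)
        for name in old_indexes
    ]
}

# This JSON would be POSTed to /_aliases.
print(json.dumps(body, indent=2))
```

Queries that currently hit an individual index would then hit the alias of the same name and get the filter applied automatically.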


If I can add any more data to this discussion please let me know!
Thanks!
Eric
