Details:
Elasticsearch version used: 1.3.4
Docs to index: ~2.2 million
Growth in docs: a few hundred docs every week
Number of fields per doc: ~10-15
Tokenizers used: ngram (min: 2, max: 15), path_hierarchy
Filters used: word_delimiter, pattern_capture, lowercase, unique
Size on disk: ~150 GB (no replicas active)
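
In case it helps, the analysis config is wired up roughly like this (index
and analyzer names below are placeholders, and I've left out the
pattern_capture filter definition and its patterns):

  # Create the index with the custom analysis chain (ES 1.3.x settings format)
  curl -XPUT 'http://localhost:9200/my_index' -d '{
    "settings": {
      "analysis": {
        "tokenizer": {
          "my_ngram": { "type": "nGram", "min_gram": 2, "max_gram": 15 },
          "my_path":  { "type": "path_hierarchy" }
        },
        "analyzer": {
          "ngram_analyzer": {
            "type": "custom",
            "tokenizer": "my_ngram",
            "filter": ["word_delimiter", "lowercase", "unique"]
          },
          "path_analyzer": {
            "type": "custom",
            "tokenizer": "my_path",
            "filter": ["lowercase"]
          }
        }
      }
    }
  }'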

Problem:
Unfortunately, I don't have the luxury of much free disk space at my disposal.
Why? [Let's just say I work for a too-big-to-fail organization, if you know what I mean :-)]
I need to reduce my index's storage footprint by at least 50%.

Solutions tried:
1. Ran _flush and _optimize on the index (the exact calls are shown after 
this list). This didn't affect the size on disk.
2. Decreased the number of primary shards from 5 to 2 (I realized this is a 
useless attempt, as the number of shards doesn't affect total disk usage).
3. Looked into closing and archiving the index (can't use this solution, as 
I want our users to search across all 2.2 million docs, so I can't archive a 
subset of them).
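
For reference, the calls for attempt 1 were along these lines (the index
name is a placeholder):

  # Flush the transaction log to disk
  curl -XPOST 'http://localhost:9200/my_index/_flush'

  # Force-merge segments down to one per shard and expunge deleted docs
  curl -XPOST 'http://localhost:9200/my_index/_optimize?max_num_segments=1'

My understanding is that optimize mainly reclaims space held by deleted or
updated docs, which would explain why it didn't move the needle here.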

Can you guys suggest any other options to reduce the index's disk footprint?
Your input is much appreciated.

Thanks,
Parth Gandhi
