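An ngram tokenizer with min=2 kills your index space: with min=2 and max=15, every substring of length 2 to 15 is indexed as its own term. Use min=3 or higher. The edge ngram tokenizer might also be an alternative, since it only emits grams anchored at the start of a token.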
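To get a feel for the numbers, here is a rough sketch in plain Python that counts the grams produced for a single token. It is only a model of what the tokenizers emit, not Elasticsearch code: the helper names `ngrams` and `edge_ngrams` and the sample token are made up for illustration, and the real effect on disk also depends on your text, filters, and stored fields.

```python
def ngrams(token, min_gram, max_gram):
    """All substrings of `token` whose length is in [min_gram, max_gram]."""
    return [
        token[i:i + n]
        for n in range(min_gram, max_gram + 1)
        for i in range(len(token) - n + 1)  # empty when n > len(token)
    ]


def edge_ngrams(token, min_gram, max_gram):
    """Only the prefixes of `token` whose length is in [min_gram, max_gram]."""
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]


if __name__ == "__main__":
    token = "elasticsearch"  # hypothetical sample token, 13 characters
    print(len(ngrams(token, 2, 15)))       # 78 grams with min=2
    print(len(ngrams(token, 3, 15)))       # 66 grams with min=3
    print(len(edge_ngrams(token, 2, 15)))  # 12 grams with edge ngrams
```

The per-token count drops only modestly when going from min=2 to min=3, but 2-character grams are shared by a very large number of tokens, so their posting lists tend to be long. Edge ngrams cut the term count much further, at the price of matching prefixes only.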
Jörg

On Sat, Oct 18, 2014 at 12:06 AM, PARTH GANDHI <parth.gandh...@gmail.com> wrote:

> Details:
> Elastic Search version used: 1.3.4
> Docs to index: ~ 2.2 Million
> Growth in docs: few 100 docs every week.
> Number of fields per doc: ~10-15
> tokenizers used: ngram (min:2, max:15), path_hierarchy
> filters used: word_delimiter, pattern_capture, lowercase, unique
> Size on disk: ~ 150 GB (No replicas are active)
>
> Problem:
> Unfortunately, I don't have the luxury of a lot of free disk space at my
> disposal.
> Why? [Let me just say I work for a too big-to-fail organizations, if you
> know what I mean :-)]
> I need to reduce my index storage footprint by at least 50%.
>
> Solutions tried:
> 1. run _flush & _optimize on the index. Didn't affect the size on disk.
> 2. decrease the number of primary shards from 5 to 2 (realized this is a
> useless attempt as number of shards doesn't affect disk space)
> 3. Looked into archiving the index after closing (can't use this solution
> as I want our users to search through all of the 2.2 Million docs, so can't
> archive partial docs)
>
> Can you guys suggest any other options to reduce index disk size?
> Your inputs are much appreciated.
>
> Thanks,
> Parth Gandhi
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/e343c173-da25-4281-8909-cea62cfdf6f3%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.