An ngram tokenizer with min=2 eats up your index space. Use min=3 or
higher. The edge ngram tokenizer might also be an alternative.
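
To get a rough feel for the numbers, here is a small Python sketch that counts
the grams produced for a single word. It is not Elasticsearch code: the helper
names ngrams and edge_ngrams are made up for illustration, and they only
approximate what the tokenizers emit (no filters, no token_chars handling).

```python
def ngrams(text, min_gram, max_gram):
    """Every substring of length min_gram..max_gram, at every start offset."""
    longest = min(max_gram, len(text))
    return [
        text[i:i + n]
        for n in range(min_gram, longest + 1)
        for i in range(len(text) - n + 1)
    ]


def edge_ngrams(text, min_gram, max_gram):
    """Only the prefixes of length min_gram..max_gram."""
    longest = min(max_gram, len(text))
    return [text[:n] for n in range(min_gram, longest + 1)]


word = "elasticsearch"  # 13 characters, sample input only
print(len(ngrams(word, 2, 15)))       # 78
print(len(ngrams(word, 3, 15)))       # 66
print(len(edge_ngrams(word, 3, 15)))  # 11
```

For one word, raising min from 2 to 3 drops only the 2-grams, while edge
ngrams keep just the prefixes, so the number of grams grows linearly with
word length instead of quadratically. How much disk this saves on a real
index depends on the data and the rest of the analysis chain, so measure it
on a sample before reindexing everything.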

Jörg

On Sat, Oct 18, 2014 at 12:06 AM, PARTH GANDHI <parth.gandh...@gmail.com>
wrote:

> Details:
> Elasticsearch version used: 1.3.4
> Docs to index: ~ 2.2 Million
> Growth in docs: few 100 docs every week.
> Number of fields per doc: ~10-15
> tokenizers used: ngram (min:2, max:15), path_hierarchy
> filters used: word_delimiter, pattern_capture, lowercase, unique
> Size on disk: ~ 150 GB (No replicas are active)
>
> Problem:
> Unfortunately, I don't have the luxury of a lot of free disk space at my
> disposal.
> Why? [Let me just say I work for a too-big-to-fail organization, if you
> know what I mean :-)]
> I need to reduce my index storage footprint by at least 50%.
>
> Solutions tried:
> 1. run _flush & _optimize on the index. Didn't affect the size on disk.
> 2. decrease the number of primary shards from 5 to 2 (realized this is a
> useless attempt as number of shards doesn't affect disk space)
> 3. Looked into archiving the index after closing (can't use this solution
> as I want our users to search through all of the 2.2 Million docs, so can't
> archive partial docs)
>
> Can you guys suggest any other options to reduce index disk size?
> Your inputs are much appreciated.
>
> Thanks,
> Parth Gandhi
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/e343c173-da25-4281-8909-cea62cfdf6f3%40googlegroups.com
> <https://groups.google.com/d/msgid/elasticsearch/e343c173-da25-4281-8909-cea62cfdf6f3%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>

