See the length token filter, which removes tokens that are too long or too short:
http://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-length-tokenfilter.html
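
If the text has to be cleaned on the client side before it is sent for indexing, a minimal Java sketch might look like the following. It assumes plain whitespace tokenization, which may not match what the analyzer on the field actually does, and the class and method names (LongTermSanitizer, dropLongTerms) are made up for illustration. The 32766 limit is Lucene's maximum term length, measured in UTF-8 bytes rather than characters.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Elasticsearch or Lucene.
final class LongTermSanitizer {

    // Lucene rejects any single term whose UTF-8 encoding is longer than this.
    static final int MAX_TERM_BYTES = 32766;

    // Returns the text with every whitespace-delimited token removed whose
    // UTF-8 encoding is longer than maxBytes. Runs of whitespace are
    // collapsed to single spaces as a side effect.
    static String dropLongTerms(String text, int maxBytes) {
        List<String> kept = new ArrayList<>();
        for (String token : text.split("\\s+")) {
            if (token.isEmpty()) {
                continue; // leading whitespace yields one empty token
            }
            if (token.getBytes(StandardCharsets.UTF_8).length <= maxBytes) {
                kept.add(token);
            }
        }
        return String.join(" ", kept);
    }

    // Convenience overload using Lucene's limit.
    static String dropLongTerms(String text) {
        return dropLongTerms(text, MAX_TERM_BYTES);
    }
}
```

Calling LongTermSanitizer.dropLongTerms(commentText) before building the index request would drop only the oversized tokens and keep the rest of the comment. Measuring bytes instead of characters matters here because a term made of multi-byte characters can exceed the limit well before it reaches 32766 characters.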

--

Itamar Syn-Hershko
http://code972.com | @synhershko <https://twitter.com/synhershko>
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Thu, Mar 12, 2015 at 10:52 AM, Bernhard Berger <
bernhardberger3...@gmail.com> wrote:

> Hi,
>
> while indexing various comments from Facebook I sometimes get Exceptions:
>
> IllegalArgumentException: Document contains at least one immense term...
>
> Is it possible to sanitize text before indexing it in Elasticsearch so that it
> doesn't throw these exceptions? Maybe there is a filter that removes overly long
> Unicode terms?
>
> For details about the failing documents, see my (unanswered) Stackoverflow 
> question: 
> http://stackoverflow.com/questions/28941570/remove-long-unicode-terms-from-string-in-java
> (I'm afraid of breaking another Elasticsearch-based (mailing list) crawler, so I'd
> better not post the failing document text here ;-) )
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/93a5ed0d-6486-48b4-a228-1aff47d14ce0%40googlegroups.com
> <https://groups.google.com/d/msgid/elasticsearch/93a5ed0d-6486-48b4-a228-1aff47d14ce0%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>
