Re: Significant terms - avoiding out of memory errors

2014-09-05 Thread Kevin B
Christoffer, how much JVM heap are you giving ES, and what is the size of the sets? According to this http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-fielddata.html it looks like in 1.4 you will be able to control the circuit breaker more via config. How

Re: elasticsearch processing pipeline capability?

2014-09-05 Thread Kevin B
index-termlist > > Jörg > > > On Tue, Aug 26, 2014 at 10:41 PM, Kevin B > wrote: > >> Is there any facility in elasticsearch to help with sending terms to an >> external process after Lucene processing (tokenization, filters, etc.)? >> The idea he

elasticsearch processing pipeline capability?

2014-08-26 Thread Kevin B
Is there any facility in elasticsearch to help with sending terms to an external process after Lucene processing (tokenization, filters, etc.)? The idea here is having some external analysis / NLP code run against the documents while keeping all the pre-processing choices consistent and in on

Listing available analyzers via API

2014-03-19 Thread Kevin B
The scenario I have is driving some index builds from an external application. As part of this, an analyzer would be chosen in the external application. The intent here is that a choice could be made from a list of all analyzers available in the ES installation, whether distributed with E

Re: Calculating a "computed" value based on index statistics / term frequencies

2014-03-04 Thread Kevin B
Quick correction. I remembered that precomputing prior to populating the index wouldn't work for me in this case, because the term frequency data for the full corpus wouldn't be available. On Tuesday, March 4, 2014 11:56:04 AM UTC+2, Kevin B wrote: > > As background I have some L