Re: configure analyzer with java api problem

2014-06-28 Thread Cédric Hourcade
Hello, Is your field user_visited configured to use your keywordAnalyzer analyzer? Can you post your index mapping? Cédric Hourcade c...@wal.fr On Fri, Jun 27, 2014 at 11:21 AM, Helennie Nie helennie...@gmail.com wrote: Hi there A field in my documents is a url like : http
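As an illustration of the question above: a field only uses a custom analyzer when its mapping names that analyzer. The following is a minimal sketch of such a mapping, built as a Python dict. The type name "visit" is hypothetical, "string" is the field type used by Elasticsearch 1.x, and the analyzer is assumed to be already defined in the index settings.

```python
import json

# Hypothetical mapping for a type named "visit" (the type name is an assumption).
# The field and analyzer names come from the thread; the analyzer itself must
# already exist in the index settings for this mapping to be accepted.
mapping = {
    "visit": {
        "properties": {
            "user_visited": {
                "type": "string",
                "analyzer": "keywordAnalyzer",
            }
        }
    }
}

# Print the JSON body that would be sent to the put-mapping API.
print(json.dumps(mapping, indent=2))
```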

Re: Filtering on Script Value

2014-06-25 Thread Cédric Hourcade
Hello, You should be able to filter with a script using the script filter: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-script-filter.html Cédric Hourcade c...@wal.fr On Wed, Jun 25, 2014 at 4:36 AM, Brian Behling brian.behl...@gmail.com wrote: I'm trying
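For readers unfamiliar with the script filter, here is a minimal sketch of a search body that wraps one in the 1.x "filtered" query. The field name "price" and the parameter "min_price" are hypothetical and only serve the illustration; the original question is truncated, so this may not match the poster's actual data.

```python
import json

# Search body using the Elasticsearch 1.x "filtered" query with a script filter.
# "price" and "min_price" are hypothetical names chosen for this example.
body = {
    "query": {
        "filtered": {
            "query": {"match_all": {}},
            "filter": {
                "script": {
                    "script": "doc['price'].value > min_price",
                    "params": {"min_price": 10},
                }
            },
        }
    }
}

# Print the JSON body that would be sent to the _search endpoint.
print(json.dumps(body, indent=2))
```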

Re: function score query _all field ?

2014-06-25 Thread Cédric Hourcade
Hello, I think only the _boost parameter is deprecated. You can still set a boost per field in your mapping. Cédric Hourcade c...@wal.fr On Wed, Jun 25, 2014 at 2:13 AM, Sergey Pilypenko s.pilype...@nimble.com wrote: Hi! I noticed that starting from 1.0.0rc1 _boost option is deprecated
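A per-field boost in a 1.x mapping could look like the sketch below. The type name "article" and the field names "title" and "body" are hypothetical; only the "boost" mapping parameter on a field is what the reply refers to.

```python
import json

# Hypothetical mapping: "article", "title" and "body" are assumed names.
# The "boost" parameter on a field raises the weight of matches in that field.
mapping = {
    "article": {
        "properties": {
            "title": {"type": "string", "boost": 2.0},
            "body": {"type": "string"},
        }
    }
}

# Print the JSON body that would be sent to the put-mapping API.
print(json.dumps(mapping, indent=2))
```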

Re: performance of multi_match

2014-06-25 Thread Cédric Hourcade
What about your sharding? Is it the same as with Solr? Did you identify some particular queries being slow? Can you compare the number of results returned between Elasticsearch and Solr? Cédric Hourcade c...@wal.fr On Wed, Jun 25, 2014 at 10:12 AM, Christoph Lingg c.li...@gmail.com wrote: Hi

Re: performance of multi_match

2014-06-25 Thread Cédric Hourcade
What about your sharding? Is it the same as with Solr? I have 5 shards without replication (one node). Would it be faster if it were only one shard? Same with Solr? Did you identify some particular queries being slow? there is a general trend of all queries being slower, not only some

Re: No terms generated for trigram analyzer

2014-06-25 Thread Cédric Hourcade
In fact they are in the _all field, but not analyzed with your trigrams analyzer. Cédric Hourcade c...@wal.fr On Wed, Jun 25, 2014 at 3:12 PM, Andreas Falk adde.f...@gmail.com wrote: I understand that my fields aren't in the _all field and that's why my query fails. But shouldn't

Re: Elascticsearch scripting

2014-06-24 Thread Cédric Hourcade
} } } Cédric Hourcade c...@wal.fr On Tue, Jun 24, 2014 at 9:14 AM, deep saxena sandy100s...@gmail.com wrote: I am looking for the elegant solution where script can serve the purpose. Some of my conditions are like this :- changing the name of the field, Some script should run as part of the query

Re: No terms generated for trigram analyzer

2014-06-24 Thread Cédric Hourcade
/_search?q=jen&analyzer=trigrams&pretty=true&df=title' I think it works for jen* because it's converted into a wildcard query. For the termvectors, you have to enable them in your mapping: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#string Cédric
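Enabling term vectors on a string field can be sketched as follows. The field name "title" is taken from the df parameter in the request above; the type name "doc" is hypothetical, and "with_positions_offsets" is just one of the accepted term_vector values, chosen here as an assumption about what is needed.

```python
import json

# Hypothetical mapping: the type name "doc" is an assumption.
# "term_vector" must be set in the mapping before documents are indexed,
# otherwise no term vectors are stored for the field.
mapping = {
    "doc": {
        "properties": {
            "title": {
                "type": "string",
                "term_vector": "with_positions_offsets",
            }
        }
    }
}

# Print the JSON body that would be sent to the put-mapping API.
print(json.dumps(mapping, indent=2))
```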

Re: performance of multi_match

2014-06-24 Thread Cédric Hourcade
with just the multi_match vs edismax, also compare the number of results. Ensure the cross_fields parameter is acting as you want, as you have a lot of fields with maybe different analyzers. Cédric Hourcade c...@wal.fr On Tue, Jun 24, 2014 at 5:09 PM, Christoph Lingg c.li...@gmail.com wrote: Hi

Re: Bulk inserting is slow

2014-06-24 Thread Cédric Hourcade
/elasticsearch/blob/master/src/test/java/org/elasticsearch/action/bulk/BulkProcessorTests.java Cédric Hourcade c...@wal.fr On Tue, Jun 24, 2014 at 5:34 PM, Frederic Esnault esnault.frede...@gmail.com wrote: Hi again, any idea about how to parallelize the bulk insert process ? I tried creating

Re: How does shingle filter work on match_phrase in query phase?

2014-06-20 Thread Cédric Hourcade
) it will match the document, because t1, t2 and t3 are considered next to each other (based on their recorded positions) for this document. Cédric Hourcade c...@wal.fr On Fri, Jun 20, 2014 at 7:04 AM, 陳智清 walker0...@gmail.com wrote: How does shingle filter work on match_phrase in query phase? After
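The idea of matching on recorded positions can be sketched with a toy model. This is a simplified stand-in for a positional index, not Lucene's implementation: it ignores slop and shingle tokens, and the function name phrase_matches is hypothetical.

```python
def phrase_matches(positions, phrase):
    """Return True if the terms of `phrase` occur at consecutive positions.

    `positions` maps each term to the list of positions recorded for it
    in a single document (a simplified stand-in for a positional index).
    """
    if not phrase:
        return False
    # Try every recorded position of the first term as a starting point.
    for start in positions.get(phrase[0], []):
        if all(start + i in positions.get(term, []) for i, term in enumerate(phrase)):
            return True
    return False


# Positions recorded at index time for a document containing "t1 t2 t3".
doc_positions = {"t1": [0], "t2": [1], "t3": [2]}

print(phrase_matches(doc_positions, ["t1", "t2", "t3"]))  # terms are adjacent
print(phrase_matches(doc_positions, ["t1", "t3"]))        # t2 sits in between
```

The first call finds the three terms at positions 0, 1 and 2 and reports a match; the second does not, because t3 is not recorded directly after t1.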

Re: 100% CPU on 1 Node with JMeter Tests

2014-06-20 Thread Cédric Hourcade
or the server itself. Cédric Hourcade c...@wal.fr On Thu, Jun 19, 2014 at 7:40 PM, sai...@roblox.com wrote: Bump On Wednesday, June 18, 2014 6:20:58 PM UTC-7, sai...@roblox.com wrote: One out of 4 nodes always spikes to 100% CPU when we do some load tests using JMeter (50 Threads, 50 Loops

Re: problem indexing with my analyzer

2014-06-20 Thread Cédric Hourcade
Does it mean you're applying the reuters analyzer on your base64 encoded pictures? I guess it generates a really huge number of tokens for each entry because of your nGram filter (with a max at 250). Cédric Hourcade c...@wal.fr On Fri, Jun 20, 2014 at 9:09 AM, Tanguy Bernard bernardtanguy1

Re: ElasticSearch queries always return all the datas stored in the index

2014-06-20 Thread Cédric Hourcade
It looks like you are doing a GET rather than a POST; if so, your query content is ignored. Cédric Hourcade c...@wal.fr On Fri, Jun 20, 2014 at 11:26 AM, Alexandre Touret alexan...@touret.info wrote: Yes My request for doe always return that answer On Friday, June 20, 2014 11:24:33 UTC+2

Re: ElasticSearch queries always return all the datas stored in the index

2014-06-20 Thread Cédric Hourcade
Ah yes, sorry, you are right, I am using some old tools :) Cédric Hourcade c...@wal.fr On Fri, Jun 20, 2014 at 11:49 AM, Alexandre Touret alexan...@touret.info wrote: I just upgraded to ES 1.2.1 and the latest release of Marvel. I have the same behaviour On Friday, June 20, 2014 11:34:59 UTC

Re: How does shingle filter work on match_phrase in query phase?

2014-06-20 Thread Cédric Hourcade
t2 and t2 t3. It will match your document. Cédric Hourcade c...@wal.fr On Fri, Jun 20, 2014 at 12:30 PM, 陳智清 walker0...@gmail.com wrote: Hello Hourcade, Thanks for your response. Does that mean different values should be set to index_analyzer and search_analyzer? (e.g. index_analyzer: shingle

Re: problem indexing with my analyzer

2014-06-20 Thread Cédric Hourcade
If your base64 encodes are long, they are going to be split into a lot of tokens by the standard tokenizer. These tokens are often going to be a lot longer than standard words, so your nGram filter will generate even more tokens, a lot more than with standard text. That may be your problem
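The growth in token count can be estimated with a little arithmetic. The sketch below counts the n-grams produced from one input token; the max of 250 comes from the earlier message in this thread, while min_gram=1 and the helper name ngram_count are assumptions, since the poster's full settings are not shown.

```python
def ngram_count(token_length, min_gram, max_gram):
    """Number of n-grams of lengths min_gram..max_gram found in one token."""
    total = 0
    # A token of length L contains L - n + 1 substrings of length n.
    for n in range(min_gram, min(max_gram, token_length) + 1):
        total += token_length - n + 1
    return total


# An ordinary 8-character word versus a 200-character base64 fragment,
# assuming min_gram=1 (an assumption) and max_gram=250 (from the thread).
print(ngram_count(8, 1, 250))
print(ngram_count(200, 1, 250))
```

Under these assumptions the 8-character word yields 36 n-grams while the 200-character fragment yields 20100, which shows how quickly long tokens inflate the index.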

Re: Query Performance

2014-06-19 Thread Cédric Hourcade
are different for each query, you may want to set the filter execution to bool (or fielddata?) to cache the terms individually rather than just the combination of them: { "terms" : { "execution": "bool", ... } } Cédric Hourcade c...@wal.fr On Thursday, June 19, 2014 at 15:12:43 UTC+2, ravim...@gmail.com wrote
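A fuller version of that terms filter, wrapped in a 1.x "filtered" query, might look like the sketch below. The field name "tags" and its values are hypothetical; only the "execution" option set to "bool" is what the reply suggests.

```python
import json

# Search body with a terms filter whose execution mode is "bool", so each
# term is evaluated as its own clause rather than as one combined filter.
# "tags" and its values are hypothetical names chosen for this example.
body = {
    "query": {
        "filtered": {
            "query": {"match_all": {}},
            "filter": {
                "terms": {
                    "tags": ["red", "blue"],
                    "execution": "bool",
                }
            },
        }
    }
}

# Print the JSON body that would be sent to the _search endpoint.
print(json.dumps(body, indent=2))
```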