which is the fastest client that could handle most requests per second? (any benchmarks?)

2015-03-29 Thread MrBu
Hello guys, Thanks for the community support here :) We would like to use a client for our nodes. However, we are limited to using nginx, and we don't have full control over it. So we wanted to use one of the official/unofficial ES clients. I have searched, but all I could find

Re: Inconsistent results (Preference = Custom (string) UserId)

2015-03-29 Thread Masaru Hasegawa
Hi, Is the number of hits the same every time? And do those documents have the same score? If so, the behavior is expected. Elasticsearch (Lucene) uses Lucene's internal document ID when the score is the same. You can supply a secondary sort criterion like _uid to make the order consistent. Masaru On
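
A minimal sketch of the tie-breaking sort suggested above, written as a Python dict that serialises to a search request body. Only `_score` and `_uid` come from the thread; the helper name, the `match` query and the field `title` are hypothetical placeholders.

```python
import json


def search_body_with_tiebreak(query):
    """Return a search body sorted by score, then by _uid to break ties."""
    return {
        "query": query,
        "sort": [
            {"_score": {"order": "desc"}},  # primary: relevance
            {"_uid": {"order": "asc"}},     # secondary: stable tie-breaker
        ],
    }


# Hypothetical query; replace with the real one.
body = search_body_with_tiebreak({"match": {"title": "java"}})
print(json.dumps(body, indent=2))
```

With the secondary criterion in place, documents with equal scores should come back in the same order regardless of which shard copy answers the request.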

Re: How does Elasticsearch calculate the field-length norm?

2015-03-29 Thread Xudong You
Thanks Masaru, so it is a precision-loss issue caused by encoding/decoding. That makes sense. On Friday, March 27, 2015 at 11:04:12 AM UTC+8, Masaru Hasegawa wrote: Hi, I believe it's because field norm is encoded in single byte. See http://lucene.apache.org/core/4_10_2/core/index.html
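
To illustrate the precision loss, here is a rough Python port of the single-byte float encoding that Lucene 4.x uses for norms (`SmallFloat.floatToByte315`). It is a sketch written from memory, not code taken from the thread, and it assumes the default similarity with a boost of 1, so that the raw norm is 1/sqrt(number of terms). The function names are my own.

```python
import math
import struct

# Offset for a 3-bit mantissa with a zero exponent of 15 (the "315" variant).
FZERO = (63 - 15) << 3


def float_to_byte315(f):
    """Encode a float into one unsigned byte (0..255), truncating the mantissa."""
    bits = struct.unpack(">i", struct.pack(">f", f))[0]  # float32 bit pattern
    small = bits >> (24 - 3)
    if small <= FZERO:
        return 0 if bits <= 0 else 1   # zero/negative -> 0, underflow -> 1
    if small >= FZERO + 0x100:
        return 255                     # overflow -> largest value
    return small - FZERO


def byte315_to_float(b):
    """Decode a byte produced by float_to_byte315 back into a float."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << (24 - 3)) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


def stored_norm(num_terms):
    """Norm as it would be read back after the one-byte round trip."""
    return byte315_to_float(float_to_byte315(1.0 / math.sqrt(num_terms)))


for n in (1, 2, 3, 4, 5):
    print(n, round(1.0 / math.sqrt(n), 4), stored_norm(n))
```

Under these assumptions, fields with 3 and 4 terms decode to the same norm, which is the kind of rounding the thread describes.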

what are the research papers that ES relies on?

2015-03-29 Thread MrBu
Other than Lucene's own research papers, what are the research papers or special algorithms that are being used by Elastic? I couldn't find a list of them in the documentation. Are there special algorithms used (and which ones are used where), for example what is the algorithm used in load

[ANN] Elasticsearch Servlet Transport plugin 2.5.0 released

2015-03-29 Thread Elasticsearch Team
Heya, We are pleased to announce the release of the Elasticsearch Servlet Transport plugin, version 2.5.0. The wares transport plugin allows using the REST interface over servlets. https://github.com/elastic/elasticsearch-transport-wares/ Release Notes - elasticsearch-transport-wares -

Re: Unexpected high CPU / IO loading

2015-03-29 Thread joergpra...@gmail.com
1. If you are not sure about merging, you should look for other reasons for high load. Identify the processes with high activity. Check if your storage I/O system can keep up. 2. You cannot turn off merging. With indices.store.throttle.type: none you disable throttling. 3. Optimizing by manual

Re: [node1] failed to reconnect to node [node1]

2015-03-29 Thread Azman Ahmad
Yes, the plan was to decommission all VMs in DigitalOcean. Before doing so, all data will be re-populated into AWS during clustering, then the VMs in DigitalOcean will be retired one by one. In AWS I have private and public IPs, while in DigitalOcean all IPs are public. Thanks. On Sunday, March 29, 2015 at 3:48:05 PM

Re: How getting document match rate?

2015-03-29 Thread Terra Sacer
Is this not possible? On Saturday, March 28, 2015 at 6:33:53 PM UTC+2, Terra Sacer wrote: Hello everyone, For example my data [ { id : 1, type : article, title : About the Java Technology }, { id : 2, type : article, title : How does ElasticSearch work }, { id : 3, type

Re: Timestamps stored as seconds since Unix epoch

2015-03-29 Thread Jean Marc Saffroy
Here is a curl recreation, hopefully that will be clearer: http://pastebin.com/DUhbpgze JM On Sun, Mar 29, 2015 at 4:18 PM, Jean Marc Saffroy j...@scality.com wrote: Of course I left a typo in my email: I do use the same field name across mapping def, docs and queries, and it does not work.

Re: Timestamps stored as seconds since Unix epoch

2015-03-29 Thread Jean Marc Saffroy
Of course I left a typo in my email: I do use the same field name across mapping def, docs and queries, and it does not work. JM On Sun, Mar 29, 2015 at 4:17 PM, Jean Marc Saffroy j...@scality.com wrote: Hi all, Not sure what I'm doing wrong, but I couldn't find a way to store my docs with

Timestamps stored as seconds since Unix epoch

2015-03-29 Thread Jean Marc Saffroy
Hi all, Not sure what I'm doing wrong, but I couldn't find a way to store my docs with timestamps as seconds since the Unix epoch and query them properly. I have my date/time field mapped like this: start_time:{type: date }, I store documents like this: { start_time: 1427631731, ...

Add non existing settings in cluster settings in elasticsearch

2015-03-29 Thread Ayman
Hi there, Recently I built Elasticsearch with fluentd and Kibana and all is working fine, but we are facing high CPU load, caused by Java, while performing searches. I have 60 GB RAM and 16 processors and each processor has 2 cores. ES_HEAP_SIZE=32g - find more details below: curl

Re: [node1] failed to reconnect to node [node1]

2015-03-29 Thread mohdjohari
unicast. here discovery.zen.ping.multicast.enabled: false discovery.zen.ping.unicast.hosts : [10.0.0.187:9300,x.x.x.x:9300,x.x.x.x:9300] *x.x.x.x = VMs in digitalocean #discovery.zen.ping.unicast.hosts : [10.0.0.187] discovery.zen.ping.timeout: 180s discovery.zen.minimum_master_nodes: 2 just to

Re: Japanese Search Results with Kuromoji plugin

2015-03-29 Thread Mangesh Ralegankar
Thanks, I was expecting the result to be zero when normal mode is on, but it tokenized it anyway. Thanks Mangesh R On Saturday, March 28, 2015 at 12:00:43 PM UTC+5:30, Masaru Hasegawa wrote: Matching is done on a term basis, not a character basis (like grep). Since kuromoji splits terms on white

Re: [node1] failed to reconnect to node [node1]

2015-03-29 Thread Mark Walkom
You're building a cluster between your AWS instances and your DigitalOcean ones, am I understanding this correctly? On 29 March 2015 at 11:56, Azman Ahmad lesdesprado2...@gmail.com wrote: unicast. here discovery.zen.ping.multicast.enabled: false discovery.zen.ping.unicast.hosts :

Re: Unexpected high CPU / IO loading

2015-03-29 Thread Mark Walkom
How high? ES will do things in the background (mostly around Lucene merging), but is it causing a problem? On 29 March 2015 at 13:28, Felix Ng felixn...@gmail.com wrote: My cluster has unexpectedly high load even though there has been no querying / indexing recently. I suspect it is the segment merging.

Re: How getting document match rate?

2015-03-29 Thread Adrien Grand
Hi Terra, This information is not available. For debugging purposes though, you can use 'explain' to have an explanation of how the score was computed, which will also tell you which clauses matched: http://www.elastic.co/guide/en/elasticsearch/reference/current/search-explain.html On Sun, Mar

Re: Timestamps stored as seconds since Unix epoch

2015-03-29 Thread Mark Walkom
Unix epoch isn't a supported format. Take a look at http://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-date-format.html On 30 March 2015 at 02:12, Jean Marc Saffroy j...@scality.com wrote: Here is a curl recreation, hopefully that will be clearer:

Re: ESLucene 32GB heap myth or fact?

2015-03-29 Thread Adrien Grand
On Thu, Mar 26, 2015 at 2:28 PM, Paweł Róg pro...@gmail.com wrote: When we sum all of sizes we can see 13.5GB of primitive arrays pointed by less than 20M references. As we can see ESLucene use a lot of arrays of primitives. I think it depends what takes memory in your heap. For instance,

Re: Add non existing settings in cluster settings in elasticsearch

2015-03-29 Thread Mark Walkom
You should drop your heap to 31GB; look up Java pointer compression if you want to know more. If your threadpools are filling up then your node is overloaded and you need more nodes. Take a look at

Re: How getting document match rate?

2015-03-29 Thread Mark Walkom
The scoring is relative to the other documents and as such there is no such thing as a 100% match. On 30 March 2015 at 01:39, Terra Sacer terrasa...@gmail.com wrote: Is this not possible? On Saturday, March 28, 2015 at 6:33:53 PM UTC+2, Terra Sacer wrote: Hello everyone, For example my

Re: [node1] failed to reconnect to node [node1]

2015-03-29 Thread Mark Walkom
If you are trying to talk from your DO instances on a public IP to your AWS instance on a private IP, it's not going to work. You should really leverage snapshot and restore instead of doing this. On 30 March 2015 at 02:06, Azman Ahmad lesdesprado2...@gmail.com wrote: Additional info, the

ES/Lucene eating up entire memory!

2015-03-29 Thread Yogesh
Hi, I have a single node ES setup (50GB memory, 500GB disk, 4 cores) and I run the Twitter river on it. I've set the ES_HEAP_SIZE to 5g. However, when I do top, the ES process shows the VIRT memory to be around 34g. That would be I assume the max mapped memory. The %MEM though always hovers

Re: ES/Lucene eating up entire memory!

2015-03-29 Thread Uwe Schindler
You should read: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html Maybe this allows you to figure out what's going on! VIRT means nothing about consumption, you should look at RES. Thanks, Uwe Am Sonntag, 29. März 2015 22:23:00 UTC+2 schrieb Yogesh: Hi, I have a
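
As a small illustration of the VIRT versus RES distinction, the sketch below pulls both figures out of the text of a Linux `/proc/<pid>/status` file, where `VmSize` corresponds to VIRT and `VmRSS` to RES in top. The helper name is my own and the sample numbers are made up (only the roughly 34g VIRT figure echoes the thread); on a real system the text would be read from `/proc/<pid>/status` for the ES process.

```python
def parse_proc_status(text):
    """Extract VmSize (VIRT) and VmRSS (RES), in kB, from /proc/<pid>/status text."""
    wanted = {"VmSize": "virt_kb", "VmRSS": "res_kb"}
    result = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            # Lines look like "VmRSS:\t 6291456 kB"; keep the number only.
            result[wanted[key]] = int(rest.split()[0])
    return result


# Hypothetical sample: about 34 GB mapped, about 6 GB actually resident.
sample = "Name:\tjava\nVmSize:\t35651584 kB\nVmRSS:\t 6291456 kB\n"
usage = parse_proc_status(sample)
print(usage)
```

A large gap between the two numbers is expected when index files are memory-mapped: the mapping counts towards VIRT, while only the pages currently held in RAM count towards RES.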

Re: ES/Lucene eating up entire memory!

2015-03-29 Thread Robert Muir
Do you know what virtual memory is? You have terabytes of it. On Sun, Mar 29, 2015 at 4:22 PM, Yogesh bindasyog...@gmail.com wrote: Hi, I have a single node ES setup (50GB memory, 500GB disk, 4 cores) and I run the Twitter river on it. I've set the ES_HEAP_SIZE to 5g. However, when I do

Re: Timestamps stored as seconds since Unix epoch

2015-03-29 Thread Jean Marc Saffroy
I thought I might have missed something, and it surprises me that the Unix epoch isn't supported! Well, I guess I'll have to work around that. Thanks! JM On Sun, Mar 29, 2015 at 10:21 PM, Mark Walkom markwal...@gmail.com wrote: Unix epoch isn't a supported format. Take a look at
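
One possible workaround, sketched in Python, is to convert the value on the client before indexing. This assumes that a `date` field with the default format accepts either a number of milliseconds since the epoch or an ISO 8601 string; the helper names are hypothetical, while the field name `start_time` and the sample value come from the original question.

```python
from datetime import datetime, timezone


def epoch_seconds_to_millis(seconds):
    """Convert seconds since the Unix epoch to milliseconds."""
    return int(seconds) * 1000


def epoch_seconds_to_iso8601(seconds):
    """Convert seconds since the Unix epoch to an ISO 8601 UTC string."""
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


doc = {"start_time": 1427631731}

# Two alternative documents to index instead of the original one.
as_millis = {"start_time": epoch_seconds_to_millis(doc["start_time"])}
as_iso = {"start_time": epoch_seconds_to_iso8601(doc["start_time"])}
print(as_millis, as_iso)
```

Range queries would then need their bounds converted the same way, so that documents and queries agree on the unit.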