Elasticsearch script score and decay function scores do not multiply

2014-12-16 Thread valerij . vasilcenko
query: { function_score: { filter : { bool : { must : [ { terms : { content : test} } ] } }, functions: [{ exp: { date: { origin: now,

Re: Completion Suggesters using Java API

2014-12-16 Thread joergpra...@gmail.com
Please look at the Elasticsearch unit tests, here https://github.com/elasticsearch/elasticsearch/blob/master/src/test/java/org/elasticsearch/search/suggest/CompletionSuggestSearchTests.java Jörg On Mon, Dec 15, 2014 at 11:44 PM, Pradeep B bpradeep.m...@gmail.com wrote: Hi In the blog post

Re: Elasticsearch script score and decay function scores do not multiply

2014-12-16 Thread valerij . vasilcenko
There was an error in the query. It should be: {"exp": {"date": {"origin": "now", "scale": "1d", "decay": 0.05}}}, {"script_score": {"script": "_score * 10", "lang": "groovy"}} On Tuesday, December 16, 2014 10:24:50 AM UTC+2,
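
To make the arithmetic behind the two functions in the corrected query concrete, here is a minimal sketch. It assumes the documented exponential decay formula, exp(lambda * max(0, distance - offset)) with lambda = ln(decay) / scale, and it assumes the two function values are combined by multiplication. The helper names and the sample query score are hypothetical, and this is not the poster's code.

```python
import math


def exp_decay(distance, scale, decay, offset=0.0):
    """Exponential decay: equals `decay` when distance - offset == scale."""
    lam = math.log(decay) / scale
    return math.exp(lam * max(0.0, distance - offset))


def multiplied_functions(query_score, distance_days):
    """Product of the decay value and the script value (_score * 10).

    Assumes the function values are combined by multiplication.
    """
    decay_part = exp_decay(distance_days, scale=1.0, decay=0.05)
    script_part = query_score * 10
    return decay_part * script_part


if __name__ == "__main__":
    # A document exactly one day (one scale) from the origin.
    print(multiplied_functions(query_score=2.0, distance_days=1.0))
```

With scale 1d and decay 0.05, a document one day from the origin keeps 5% of its function value, so recent documents dominate quickly.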

Aggregation buckets count

2014-12-16 Thread Tom
Hi, is there a way to get just the number of buckets of an aggregation (not the document count, which I know works) without receiving the full bucket contents? Thanks, Tom -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this

[Kibana] group by request?

2014-12-16 Thread stephanos
Hey there, we are using Google App Engine to host our SaaS app. Google offers a nice log browser but it is way too slow. So one of my colleagues suggested we pipe our logs to logstash and make them accessible via Kibana. So far so good, we managed to set everything up. But when Kibana was

Write amplification and SSD

2014-12-16 Thread AndrewK
At the core training I attended last year there was a side note on SSD and write amplification: roughly along the lines of: write amplification can be a big problem with SSD (as writes can be around 4KB but deletes are often in blocks of around 512KB, and that the problem gets worse the smaller

Re: Write amplification and SSD

2014-12-16 Thread Michael McCandless
It means that ES works well with SSDs since Lucene is write-once under the hood, so it is easy on the SSDs, vs other approaches which do random writes to different places causing the higher write amplification. But, this is balanced with the fact that Lucene must also periodically merge the

Re: Write amplification and SSD

2014-12-16 Thread joergpra...@gmail.com
All SSDs have internal rewrites due to wear leveling and garbage collection, and the issue is not only caused by random writes from the application, but that too many internal rewrites reduce SSD performance and lifetime. I think the contribution of reducing write amplification from application
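
The block sizes quoted at the start of this thread (writes of about 4KB, erases in blocks of about 512KB) give a rough upper bound on write amplification. The sketch below only works through that arithmetic; the function name is hypothetical, and real SSD controllers batch and remap writes, so actual amplification is normally far lower.

```python
def worst_case_write_amplification(host_write_kb, erase_block_kb):
    """Ratio of flash rewritten to data the host asked to write.

    Worst case only: assumes a whole erase block is rewritten
    for every single small host write.
    """
    return erase_block_kb / host_write_kb


if __name__ == "__main__":
    print(worst_case_write_amplification(host_write_kb=4, erase_block_kb=512))
```

Large sequential writes, such as the write-once segments described above, fill whole erase blocks and so move this ratio toward 1.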

Re: Aggregation buckets count

2014-12-16 Thread Rich Somerfield
Hi Tom, I think the Cardinality aggregation is what you want. e.g. : { ...query... }, aggregations: { totalUniqueUsers: { cardinality: { field: username } } } -Rich On Tuesday, December 16, 2014 8:48:51 AM UTC, Tom wrote: Hi, is there a way to get just the count of
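
A minimal sketch of the request body suggested above, built as a Python dictionary so it can be serialized to JSON. The helper name and the default aggregation name are hypothetical, and adding "size": 0 to suppress the hits is an assumption rather than something stated in the thread. Note that the cardinality aggregation returns an approximate distinct count.

```python
import json


def bucket_count_request(field, agg_name="bucket_count"):
    """Body that asks only for the number of distinct values of `field`."""
    return {
        "size": 0,  # assumption: skip returning the matching documents
        "aggregations": {
            agg_name: {"cardinality": {"field": field}},
        },
    }


if __name__ == "__main__":
    print(json.dumps(bucket_count_request("username", "totalUniqueUsers"), indent=2))
```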

File Descriptors

2014-12-16 Thread Chetan Dev
Hi, What does a file descriptor value of -1 mean? What is the default value for it? Thanks }, WkgDi0joTYSrs5sO3_bndQ : { name : AEPLPERF2, transport_address : inet[/192.168.1.13:9300], host : AEPLPERF1, ip : 192.168.1.13, version : 1.4.1, build :

Re: Have a problem to map lat long field

2014-12-16 Thread Anoop P R
Thanks David, now it works for me. On Tuesday, December 16, 2014 12:55:45 PM UTC+5:30, David Pilato wrote: There is no autodetection of geo_point, so you need to provide a mapping before sending the first document. David On 16 Dec 2014, at 07:47, Anoop P R anoopkuma...@gmail.com
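
Since geo_point is not auto-detected, the mapping has to exist before the first document is indexed. The sketch below builds such a mapping body for a typed (1.x style) index. The type name and field name are hypothetical placeholders, not taken from the thread.

```python
import json


def geo_point_mapping(type_name, field_name):
    """Index-creation body declaring `field_name` as a geo_point."""
    return {
        "mappings": {
            type_name: {
                "properties": {
                    field_name: {"type": "geo_point"},
                }
            }
        }
    }


if __name__ == "__main__":
    # "place" and "location" are illustrative names only.
    print(json.dumps(geo_point_mapping("place", "location"), indent=2))
```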

Re: Aggregation buckets count

2014-12-16 Thread Tom
Hi Rich, perfect, that's it, thanks a lot. Cheers, Tom On Tuesday, 16 December 2014 11:02:04 UTC+1, Rich Somerfield wrote: Hi Tom, I think the Cardinality aggregation is what you want. e.g. : { ...query... }, aggregations: { totalUniqueUsers: { cardinality: { field:

processors configuration (available_processors)

2014-12-16 Thread Robin Clarke
I have a machine with 132GB of memory and 32 cores on which I am running two elasticsearch nodes. Each node should have only half the total number of CPU cores available so that both nodes can work at full capacity and not block each other. I believe the correct configuration option would be:

Re: processors configuration (available_processors)

2014-12-16 Thread Robin Clarke
And in the spirit of answering your own questions, I've already been helped to the answer: curl -XGET http://localhost:9200/_nodes/thread_pool?pretty http://localhost:9200/_nodes/os?pretty ... index : { type : fixed, min : 16, max : 16,

Re: processors configuration (available_processors)

2014-12-16 Thread joergpra...@gmail.com
You have to set up a container and assign virtual processors to it, and install ES in there so the Java JVM can use only these cores. With 16 cores, you have 32 processors because of Intel's hyperthreading technology which doubles the number of available processing units in a core. Thread pool
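
The processor arithmetic in this reply can be written out as a small sketch. It assumes two hardware threads per physical core and an even split between nodes; the function name is hypothetical, and the numbers are the ones mentioned in the thread.

```python
def processors_per_node(physical_cores, threads_per_core, nodes):
    """Logical processors available to each node after an even split."""
    logical = physical_cores * threads_per_core
    return logical // nodes


if __name__ == "__main__":
    # 16 physical cores with hyperthreading, shared by two nodes.
    print(processors_per_node(physical_cores=16, threads_per_core=2, nodes=2))
```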

Re: Write amplification and SSD

2014-12-16 Thread AndrewK
Thank you for the feedback!

Re: ElasticSearch hadoop - .EsHadoopSerializationException

2014-12-16 Thread Costin Leau
Having multiple types shouldn't be an issue - ES is a document store so it's pretty common to have different types. In other words, this is not the intended behavior - can you please create a small sample/snippet that reproduces the error and raise an issue for it [1] ? Thanks! [1]

Help with request

2014-12-16 Thread Валерий Хвалев
Hi all, I'm a newbie to Elastic. I usually use Sphinx in my projects, but due to some limitations I decided to play with Elastic, and I like it. But now I'm stuck on a request and need your help. Could you please point me in the right direction on how to make such a request: I have index (sample)

Re: Aggregation buckets count

2014-12-16 Thread Rich Somerfield
Happy to help. On Tuesday, December 16, 2014 10:37:25 AM UTC, Tom wrote: Hi Rich, perfect, that's it, thx a lot. Cheers, Tom On Tuesday, 16 December 2014 11:02:04 UTC+1, Rich Somerfield wrote: Hi Tom, I think the Cardinality aggregation is what you want. e.g. : { ...query...

Re: Return Logstash Failed User logons by day and return code.

2014-12-16 Thread Rod Clayton
Dear Sachin, Here is the gist with the output you requested: https://gist.github.com/ka3bhy/082a5410d36264521ccb On Monday, December 15, 2014 10:13:02 PM UTC-5, Sachin Divekar wrote: Hi, Share output of

Re: Is ElasticSearch truly scalable for analytics?

2014-12-16 Thread Adrien Grand
What do you mean by node-level reduce? On Mon, Dec 15, 2014 at 10:52 PM, Yifan Wang yifan.wang@gmail.com wrote: If I understand correctly, ElasticSearch sends the query directly to each shard and collects aggregated results from it. As the number of shards increases, the reduce phase on the client node

Re: Return Logstash Failed User logons by day and return code.

2014-12-16 Thread Sachin Divekar
Hi Rod, Try the following URL: http://localhost:9200/_search?q=status:FAILURE&pretty In the output you will find something like the following: hits: { total: 7, max_score: 1, hits: [ - So in hits block value of total
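
A minimal sketch of the two steps described above: building the URI search and reading the total from the response. The helper names are hypothetical, the base URL is the one from the message, and the sample response mirrors only the fragment quoted there (the inner hits list is left empty for illustration).

```python
from urllib.parse import urlencode


def uri_search_url(base, field, value):
    """URL for a URI search such as q=status:FAILURE with pretty output."""
    query = urlencode({"q": f"{field}:{value}", "pretty": "true"})
    return f"{base}/_search?{query}"


def total_hits(response):
    """Number of matching documents reported in the hits block."""
    return response["hits"]["total"]


if __name__ == "__main__":
    print(uri_search_url("http://localhost:9200", "status", "FAILURE"))
    sample = {"hits": {"total": 7, "max_score": 1, "hits": []}}
    print(total_hits(sample))
```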

Recover data from corrupted index

2014-12-16 Thread Octavian
Hi, I have a corrupted index in an Elasticsearch cluster. The index is corrupted due to bad mappings. As you can see in the example, there are two fields with the same name and different mappings - one is string/doc_values and one is date, and in Lucene this is not possible, due to doc_values

How to use json in update script

2014-12-16 Thread Roger de Cordova Farias
Hello I'm trying to update a document whose root object contains a list of nested objects. I need to send an object of the nested type as a script parameter to append to the list How can I append the json (a string type) to the nested objects list of the root object using Groovy? or should I use

Re: Return Logstash Failed User logons by day and return code.

2014-12-16 Thread Rod Clayton
Dear Sachin, I want to aggregate them by username and workstation and get a count. I need to produce a report if there are too many failures for an account. I figured out how to limit the search to a particular day by saying

Where is the data stored? ElasticSearch YARN

2014-12-16 Thread Rafael Pellon
Hi, we are testing Elasticsearch in an HDP environment using YARN. We followed the instructions at http://www.elasticsearch.org/blog/elasticsearch-yarn-and-ssl/ and uploaded a lot of data, but where is the data stored? Is it in the local file system or HDFS? Is it persisted? What is the

Re: AWS machine for ES master

2014-12-16 Thread Jeff Keller
Just curious, what version of ES are you running? Thanks, Jeff On Sunday, December 14, 2014 10:48:07 AM UTC-5, Yoav Melamed wrote: Thanks On Sunday, December 14, 2014 11:01:58 AM UTC+2, Yoav Melamed wrote: Hello, I run an Elasticsearch cluster in AWS based on c3.8xlarge machines. Can I

Re: Where is the data stored? ElasticSearch YARN

2014-12-16 Thread Costin Leau
I recommend reading the project documentation [1]; there's a dedicated section that covers storage [2]. [1] http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/index.html [2] http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/ey-setup.html#_storage On 12/16/14

Re: How to use json in update script

2014-12-16 Thread Roger de Cordova Farias
Ok, I found out that I can send a JSON as a script parameter and just append it to the nested objects list (with list += newObject or list.add(newObject) ) using groovy and it works But it is not working with the Java API, I can only get it to work using the REST API. When using Java the JSON is

Re: How to use json in update script

2014-12-16 Thread Roger de Cordova Farias
*Does not work (using JAVA API):* String script = "ctx._source.objectsList += newObject"; UpdateRequestBuilder prepareUpdate = client.prepareUpdate(indexName, typeName, id); prepareUpdate.setScriptLang("groovy"); prepareUpdate.setScript(script, ScriptType.INLINE);
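
For comparison, the REST variant that reportedly works can be sketched as an update body in which the new nested object travels as a script parameter. The script text and the parameter name newObject come from the thread; the helper name and the sample object are hypothetical.

```python
import json


def append_update_body(list_field, new_object):
    """Update body that appends `new_object` to a list via a Groovy script."""
    return {
        "script": f"ctx._source.{list_field} += newObject",
        "params": {"newObject": new_object},
        "lang": "groovy",
    }


if __name__ == "__main__":
    # The object below is an illustrative placeholder.
    body = append_update_body("objectsList", {"name": "example", "value": 1})
    print(json.dumps(body, indent=2))
```

Passing the object as a structured parameter, rather than as a JSON string, lets the script append a map instead of a plain string.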

Re: Return Logstash Failed User logons by day and return code.

2014-12-16 Thread Sachin Divekar
Hi Rod, What you need is a multi-level terms aggregation. The general format of such a query is as follows. { "aggs": { "agg1": { "terms": { "field": "field1" }, "aggs": { "agg2": { "terms": { "field": "field2" }, "aggs": { "agg3": { "terms": { "field": "field3" } } } } } } } } In your case you can use the following query {
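
The general format above nests one terms aggregation inside another, so it can be generated from a list of fields. This is a minimal sketch; the function name is hypothetical, and the aggregation names agg1, agg2, ... follow the pattern used in the message.

```python
import json


def nested_terms(fields):
    """Build a multi-level terms aggregation, outermost field first."""
    body = {}
    # Work from the innermost level outwards so each level can wrap the previous one.
    for depth, field in reversed(list(enumerate(fields, start=1))):
        level = {"terms": {"field": field}}
        if body:
            level["aggs"] = body
        body = {f"agg{depth}": level}
    return {"aggs": body} if body else {}


if __name__ == "__main__":
    print(json.dumps(nested_terms(["field1", "field2", "field3"]), indent=2))
```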

Is ElasticSearch truly scalable for analytics?

2014-12-16 Thread AlexR
ES already does aggregations on each node; it does not ship row-level query data back to the master for aggregation. In fact, one unpleasant effect is that aggregation results are not guaranteed to be precise, due to the distributed nature of the aggregation, for multi-bucket aggs
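
The imprecision mentioned above can be illustrated with a toy simulation: each shard reports only its own top terms, and the coordinating node merges those partial lists. This is a simplified model with made-up counts, not Elasticsearch's actual implementation, and all names are hypothetical.

```python
from collections import Counter


def top_terms(counts, size):
    """The `size` most frequent terms on a single shard."""
    return dict(Counter(counts).most_common(size))


def merged_top_terms(shards, shard_size):
    """Merge the per-shard top lists, as a coordinating node would."""
    merged = Counter()
    for shard in shards:
        merged.update(top_terms(shard, shard_size))
    return merged


def exact_counts(shards):
    """Ground truth: sum every term over every shard."""
    total = Counter()
    for shard in shards:
        total.update(shard)
    return total


if __name__ == "__main__":
    shards = [{"a": 10, "b": 9, "c": 8}, {"c": 10, "d": 9, "a": 1}]
    print(merged_top_terms(shards, shard_size=2))
    print(exact_counts(shards))
```

In this toy data the term c is cut from the first shard's top two, so the merged count understates it even though it is the most frequent term overall.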

Re: Is ElasticSearch truly scalable for analytics?

2014-12-16 Thread Elvar Böðvarsson
Elasticsearch supports tribe nodes, so you can combine multiple clusters, you then query the tribe node to access data on all of them. On Monday, December 15, 2014 9:52:45 PM UTC, Yifan Wang wrote: If I understand correctly, ElasticSearch directly sends query to and collects aggregated

Re: Same query, different CPU util when run with Java API versus REST

2014-12-16 Thread Marie Jacob
I have this same issue, except that CPU utilization is approximately the same, but response times are very different. I'm using ES 1.3.4. I have two JMeter tests simulating concurrent tests with the exact same configuration. The test using the Java Sampler with the Transport Client are showing

zen disco socket usage and port

2014-12-16 Thread Paul Baclace
Is it normal for a single node elasticsearch process to open 13 sockets to itself? This seems like an excessive zen disco party and inexplicable. I am trying out v1.4. Is it possible to set the transport protocol port?

Re: File Descriptors

2014-12-16 Thread Andrew Selden
What OS are you on? My guess would be that the library (sigar) that reads that value can’t find it. On Dec 16, 2014, at 2:15 AM, Chetan Dev cheten@carwale.com wrote: Hi, What does file descriptor value of -1 shows? what is the default value for it ? Thanks },

Re: Same query, different CPU util when run with Java API versus REST

2014-12-16 Thread Jeff Potts
Yes, getReadClient() gets a Node Client that is instantiated by Spring and then injected as a dependency. I have tried the Transport Client as well and it makes no difference. An interesting finding is that I have isolated the performance degradation to the Range filter. When my service has

Re: Same query, different CPU util when run with Java API versus REST

2014-12-16 Thread Jeff Potts
Scratch what I said about total shard search query time. I had started a client node and was pointing my REST test at that. CPU difference still there. Also, I've upgraded to 1.3.7. Jeff

Re: zen disco socket usage and port

2014-12-16 Thread Paul Baclace
More info: I tried to set the transport port like this: -Dtransport.tcp.port= on the elasticsearch command line, but it still uses port 9300. On Tuesday, December 16, 2014 12:42:12 PM UTC-8, Paul Baclace wrote: Is it normal for a single node elasticsearch process to open 13 sockets to

High Disk Watermark exceeded on one or more nodes

2014-12-16 Thread Pauline Kelly
I'm running an elk + redis stack on this machine, and just started collecting eventlogs via GELF from a windows server. I had a look at the logs recently, and this came up: [2014-12-17 09:31:03,820][WARN ][cluster.routing.allocation.decider] [logstash test] high disk watermark [10%] exceeded

Re: Same query, different CPU util when run with Java API versus REST

2014-12-16 Thread joergpra...@gmail.com
The difference between DSL and Java is that in Java you use search type QUERY_AND_FETCH which is slow. Jörg On Mon, Dec 15, 2014 at 3:27 PM, Jeff Potts jeffpott...@gmail.com wrote: Yes, updated the gist. Thanks for taking a look at this. Jeff On Sunday, December 14, 2014 11:52:35 AM UTC-6,

Re: zen disco socket usage and port

2014-12-16 Thread joergpra...@gmail.com
ES uses a netty worker pool in order to be able to connect to multiple nodes. The size of the pool does not automatically take into consideration that you want a single node cluster only, it does not shrink or grow. You can reduce the size of the netty worker pool with the setting

Re: zen disco socket usage and port

2014-12-16 Thread Paul Baclace
Now that you pointed out the additional es. prefix needed when specifying on the command line, I can see that: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-configuration.html shows it via color in an example and with instead of. Apparently, that was not obvious
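
The point about the es. prefix can be sketched as a tiny helper that turns a setting name into the command-line form. The helper name is hypothetical, and the port value 9301 is an illustrative placeholder because the value in the earlier message was not preserved.

```python
def es_system_property(setting, value):
    """Format a setting as a JVM system property with the es. prefix."""
    return f"-Des.{setting}={value}"


if __name__ == "__main__":
    # 9301 is a made-up example port.
    print(es_system_property("transport.tcp.port", 9301))
```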

FiltrES - A language that compiles to ElasticSearch Query DSL

2014-12-16 Thread Abe Haskins
Hi folks! I wanted to share FiltrES.js https://github.com/abeisgreat/FiltrES.js, a tool for compiling simple human readable expressions (i.e. '(height = 73 or (favorites.color == green and height != 73)) and firstname ~= o.+)') into ES queries. This is useful for times when you want end users

Re: Same query, different CPU util when run with Java API versus REST

2014-12-16 Thread Jeff Potts
I figured it out. The queries executed by the straight Elasticsearch REST test and the Java test are not exactly identical. The difference is that the current date used by the straight Elasticsearch test is pulled from a randomized CSV file whereas the Java service always injects the actual

Re: duplicate data on hadoop

2014-12-16 Thread Mungeol Heo
Try to use one index instead of multiple indexes. For instance, change 'es.resource' = 'event-conversion*/conversion' to 'es.resource' = 'event-conversion-01/conversion'. I think es-hadoop does not support multiple indexes setting for now. On Wed, Nov 19, 2014 at 11:57 AM, Jimmy Carter

User free text query

2014-12-16 Thread Bruno Kamiche
I need to query elasticsearch and let the user specify on which other fields to search for certain attributes... I have a field named texto with the actual text on which queries are done. I have a field named category with values like 0001,0003 (meaning the record is on categories 1 and 3), or

ES shard placement on different nodes

2014-12-16 Thread Nidhi Gupta
Hi, I have a multi-index ES cluster. I read that ES tries to distribute shards equally across nodes. Question: does ES balance primary and replica shards together per index across all nodes, or does it distribute the primary shards per index first and then the replica shards per index? so its it

Re: ES shard placement on different nodes

2014-12-16 Thread David Pilato
Yes, it's possible, but you should not be afraid of this. In the end, both primaries and replicas perform the same index/search operations. HTH David On 17 Dec 2014, at 02:29, Nidhi Gupta gupta.nidh...@gmail.com wrote: Hi, I have multi index ES cluster. I read ES tries to equally

Re: User free text query

2014-12-16 Thread David Pilato
Actually, elasticsearch will search for computer in the _all field and, as you said, 0003 in the category field. BTW, you should maybe disable the _all field and use the copy_to feature instead. If your interface has different inputs for text and category, then you would probably do better to use the QueryDSL with

Re: User free text query

2014-12-16 Thread Bruno Kamiche
My application interface offers fields for filtering (and they are working on elastic), but for power users there is always the free text query option, my query in elastic is as follows: query : { filtered : { query : {

Re: User free text query

2014-12-16 Thread David Pilato
Replace match with a simpleQueryString query. David On 17 Dec 2014, at 03:13, Bruno Kamiche bkami...@gmail.com wrote: My application interface offers fields for filtering (and they are working on elastic), but for power users there is always the free text query option, my query in

Re: Search sort using a field with an index_name results in No mapping found

2014-12-16 Thread Dave Reed
Just in the interest of having a two-way link, I opened a github issue about it here: https://github.com/elasticsearch/elasticsearch/issues/8980 On Monday, December 15, 2014 9:41:01 AM UTC-8, Dave Reed wrote: I see you tried adding index_name to the inner field as well. Nope, I'm afraid

Re: User free text query

2014-12-16 Thread Bruno Kamiche
Ok, it works, but I've found a new problem: it only works if the field category is exactly 0003. It seems it is not taking into account that some records have that field as 0001,0003 or value1,value2,value3,...,0003,0005 and the like... On Tuesday, December 16, 2014 9:22:22 PM UTC-5, David

elk cluster plan with 7000EPS an 100/s search

2014-12-16 Thread Wang Yong
Hi folks, I am building an ELK cluster to index and search lots of HTTP access logs, more than 7000 events per second, and there will also be more than 100 concurrent searches. I have 2 machines. One of them has 24 CPU cores, 64G memory and a 2T SATA disk (no RAID). The other one is much

Re: User free text query

2014-12-16 Thread Bruno Kamiche
Any pointers on that? On Tuesday, December 16, 2014 10:53:26 PM UTC-5, David Pilato wrote: It’s an analysis issue I believe. Mostly depends on your mapping for this field. -- David Pilato | Technical Advocate | Elasticsearch.com @dadoonet

Re: User free text query

2014-12-16 Thread David Pilato
I don’t know if you are familiar with the analysis process. If not, you could start by reading this: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/mapping-analysis.html -- David Pilato | Technical
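
One way to picture the analysis issue raised in this thread: if a value such as 0001,0003 is indexed as a single token, a search for 0003 cannot match it. The sketch below shows index settings for a pattern analyzer that splits on commas, plus a plain-Python simulation of the resulting tokens. This is one possible approach, not something confirmed in the thread; the analyzer name and helper names are hypothetical.

```python
import json
import re


def comma_analyzer_settings(name="comma_separated"):
    """Index settings defining a pattern analyzer that splits on commas."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    name: {"type": "pattern", "pattern": ","},
                }
            }
        }
    }


def simulate_tokens(value):
    """Rough local imitation of splitting a field value on commas."""
    return [token for token in re.split(",", value) if token]


if __name__ == "__main__":
    print(json.dumps(comma_analyzer_settings(), indent=2))
    print(simulate_tokens("0001,0003"))
```

The category field would then need a mapping that references this analyzer, which is not shown here.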

Re: $ES_HEAP_SIZE

2014-12-16 Thread 潘飞
On Wednesday, February 13, 2013 at 6:43:40 AM UTC+8, kimchy wrote: Note, even if you run just one ES instance, on a 128gb, with 30gb allocated to ES, you potentially don't lose much unless you really need ES to have more memory. Depending on your index size, the file system cache will kick in and use whatever it

Re: FiltrES - A language that compiles to ElasticSearch Query DSL

2014-12-16 Thread David Pilato
Cool stuff Abe. Just a note, I think it’s a nice feature for developers but I’m more concerned with end users. A dev knows that == means equal but an end user will use field=x. Same for different: != does not mean anything to a non-developer. My 2 cents. -- David Pilato | Technical Advocate

Re: File Descriptors

2014-12-16 Thread Chetan Dev
Hi, I am on Windows. Thanks On Wednesday, December 17, 2014 2:12:16 AM UTC+5:30, Andrew Selden wrote: What OS are you on? My guess would be that the library (sigar) that reads that value can’t find it. On Dec 16, 2014, at 2:15 AM, Chetan Dev chete...@carwale.com wrote:

Re: High Disk Watermark exceeded on one or more nodes

2014-12-16 Thread Mark Walkom
It looks like this - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-allocation.html#disk What is your actual disk usage? Can you run a curl -XGET localhost:9200/_cluster/settings and see if it mentions those settings? On 16 December 2014 at 23:28, Pauline
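
Once the cluster settings response has been fetched, checking whether it mentions the disk watermark settings can be sketched as below. The sketch assumes the response has persistent and transient sections, accepts both nested and flat key layouts, and lets transient values override persistent ones. The helper names and the sample percentages are hypothetical.

```python
WATERMARK_KEYS = (
    "cluster.routing.allocation.disk.watermark.low",
    "cluster.routing.allocation.disk.watermark.high",
)


def flatten(settings, prefix=""):
    """Turn nested settings into dotted keys; flat keys pass through."""
    flat = {}
    for key, value in settings.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def watermark_settings(cluster_settings):
    """Explicitly configured watermarks, transient overriding persistent."""
    found = {}
    for scope in ("persistent", "transient"):
        scoped = flatten(cluster_settings.get(scope, {}))
        for key in WATERMARK_KEYS:
            if key in scoped:
                found[key] = scoped[key]
    return found


if __name__ == "__main__":
    # Sample values are made up for illustration.
    sample = {
        "persistent": {"cluster": {"routing": {"allocation": {"disk": {"watermark": {"low": "85%"}}}}}},
        "transient": {},
    }
    print(watermark_settings(sample))
```

An empty result means neither setting was set explicitly, in which case the defaults apply.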