Need help/suggestions for a massive-queries use case

2014-09-01 Thread Yuheng Du
Hi guys, I have some streaming sensor data as input to ES. For each incoming data message, I need to do a query on the historic data in ES according to the 'timestamp' and 'messageId' in that message. I need to get the aggregated query results in real-time. My problem is each data message may

Re: Get distinct data

2014-09-01 Thread vineeth mohan
Hello Alex , Term aggregation is here to save your day - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#search-aggregations-bucket-terms-aggregation Thanks Vineeth On Tue, Sep 2, 2014 at 12:07 PM, Alex T wrote:
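
A minimal sketch of the kind of request body a terms aggregation takes, written as a Python dict for illustration. The field name `sourceId` and the aggregation name `distinct_sources` are assumptions, not taken from the reply; each bucket key in the response would be one distinct value of the field.

```python
import json


def distinct_values_body(field, max_buckets=10):
    """Build a search body whose terms aggregation yields one bucket per distinct value."""
    return {
        "size": 0,  # assumption: only the aggregation is wanted, not the hits
        "aggs": {
            "distinct_sources": {  # hypothetical aggregation name
                "terms": {"field": field, "size": max_buckets}
            }
        },
    }


body = distinct_values_body("sourceId")
print(json.dumps(body, indent=2))
```

The body would be sent to the index's `_search` endpoint; `max_buckets` caps how many distinct values come back.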

Re: How to Speed Up Indexing

2014-09-01 Thread xiehaiwei
Hi, "If 'Word Segmentation' is the problem" - I mean that the word segmentation analyzer is slow: about 1 MB/s when run independently. In our case, many fields of a document need to be segmented. "more machines with a shard" - Will a shard run on multiple nodes? Do yo

Get distinct data

2014-09-01 Thread Alex T
Hi all! I have a problem getting unique data from Elasticsearch. I have the following documents: [ { "message": "Message 1", "author": { "id": 4, "name": "Author Name" }, "sourceId": "123456789", "userId": "123456" }, { "message": "Message 1", "author": { "id": 4, "name": "A

Re: Transport Client connectedNodes() duplicates

2014-09-01 Thread David Pilato
Interesting. Maybe you could open an issue for that? Something like "Transport Client with sniff duplicates nodes"? -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 2 Sept 2014 at 04:50, Stefan Will wrote: Hi, for testing purposes, I've started up a stand-alone Elast

Re: How to Speed Up Indexing

2014-09-01 Thread vineeth mohan
Hello, One tip from my experience - 1. Disable refresh before bulk indexing and re-enable it once it's done. By default, ES refreshes every 1 second, which makes all documents indexed during that time searchable. - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indice
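
The tip above can be sketched as two settings bodies, assuming they are sent with PUT to the index's `_settings` endpoint before and after the bulk load. The helper name `refresh_settings` is hypothetical; `"-1"` disables the periodic refresh and `"1s"` restores the default interval.

```python
import json


def refresh_settings(interval):
    """Body for the index update-settings call that sets the refresh interval."""
    return {"index": {"refresh_interval": interval}}


# Before the bulk load: "-1" turns the periodic refresh off.
disable = refresh_settings("-1")
# After the bulk load: go back to the default of one second.
restore = refresh_settings("1s")

print(json.dumps(disable))
print(json.dumps(restore))
```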

How to Speed Up Indexing

2014-09-01 Thread xiehaiwei
Hi all, In our ES system, each row of a MySQL table is indexed as a document, but indexing is slow. My questions: 1) How much faster is indexing with the Bulk API compared with indexing documents one at a time? 2) If 'Word Segmentation' is the problem, how do I deal with it? 3) Can I use multiple nodes of ES clust

Re: Is there any way to get the total number of buckets a aggs generated ?

2014-09-01 Thread panfei
Though this is not the solution, thanks for your advice. 2014-09-02 11:31 GMT+08:00 vineeth mohan : > Hello , > > Couple of options here. > Add the count aggregation - > http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-metrics-valuecount-aggregation.html

Re: Is there any way to get the total number of buckets a aggs generated ?

2014-09-01 Thread vineeth mohan
Hello , Couple of options here. Add the count aggregation - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-metrics-valuecount-aggregation.html { "aggregations": { "aggs": { "value_count": { "field": "names" } } } } Or , s

Transport Client connectedNodes() duplicates

2014-09-01 Thread Stefan Will
Hi, for testing purposes, I've started up a stand-alone Elasticsearch node on my laptop, and am using the transport client to connect to it. When I initialize the client using "sniff=true", and then print out the list of connected nodes, as follows: TransportClient client = new TransportCl

Re: Is there any way to get the total number of buckets a aggs generated ?

2014-09-01 Thread panfei
Hi vineeth, thanks for the tips, but I really want to know the number of buckets generated by the aggs, i.e. a bucket_count value (like the doc_count value) in the response. 2014-09-02 10:00 GMT+08:00 vineeth mohan : > Hello , > > Setting search_type to count avoids executing the fetch phas

Re: Is there any way to get the total number of buckets a aggs generated ?

2014-09-01 Thread vineeth mohan
Hello , Setting search_type to count avoids executing the fetch phase of the search making the request more efficient. See Search Type for more information on the search_type parameter. Thanks

Re: Is there any way to get the total number of buckets a aggs generated ?

2014-09-01 Thread panfei
To some degree, I can achieve this goal by setting size: 0 in the terms aggs to list all the generated buckets, but that consumes a lot of memory, so is there any internal API to do this job? Thanks. 2014-09-02 9:38 GMT+08:00 panfei : > we do a aggregation, and we only want to get the number
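
One alternative that may be worth trying, sketched here under assumptions and not confirmed anywhere in this thread: a `cardinality` aggregation returns an approximate count of the distinct values of a field, which for a terms aggregation on the same field approximates the number of buckets without listing them. The field name `names` and the aggregation name `bucket_count` are hypothetical.

```python
import json


def bucket_count_body(field):
    """Search body asking for an approximate distinct-value count of `field`."""
    return {
        "size": 0,  # assumption: hits are not needed
        "aggs": {
            "bucket_count": {  # hypothetical aggregation name
                "cardinality": {"field": field}
            }
        },
    }


body = bucket_count_body("names")
print(json.dumps(body, indent=2))
```

The count is an estimate, so it suits cases where an exact bucket total is not required.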

refresh and flush time reference in the cluster or in the node or in the index or in the other?

2014-09-01 Thread fiefdx yang
I know that, with the default configuration, ES executes a refresh every second and a flush every 30 minutes or when triggered by the translog. But I do not know what provides the time reference, and at what level of the ES framework it lives. -- You received this message because you are

Is there any way to get the total number of buckets a aggs generated ?

2014-09-01 Thread panfei
We run an aggregation, and we only want to get the number of buckets it generated. Is there any way to achieve this goal? Thanks -- If you don't study, you won't know

Re: phrase suggester's sort mode

2014-09-01 Thread Nikolas Everett
Sounds like a useful feature then! I'll certainly review it if you send a pull request though I can't merge stuff. On Sep 1, 2014 1:57 PM, "Heval Azizoğlu" wrote: > Yeah i know that actually currently i am manipulating term suggester to > get what i want but this feature is quite basic, i think,

Re: Need Help: Upgrade of ES + Large queries = new CPU overload

2014-09-01 Thread Scott Decker
well, in case anyone wants to know, it was because we had _cache:true and _cache_key: items in our filter sets. basically because they are known filters that do not change. for some reason, having this set caused huge amounts of cpu usage. not sure what was happening behind the scenes, but this

aggregation of hierarchical elements possible?

2014-09-01 Thread skippi1
The index has a field named "path" which contains the canonical file name, e.g.: /a/file1 /a/file2 /a/b/file3 Is it possible to create a bucket aggregation that counts all files per path, including subfolders? Something like this: /a => 3 files /a/b => 1 file Regards, Markus -- View this m
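
To make the requested result concrete, here is a client-side sketch in Python of the counting itself. It is not an Elasticsearch aggregation, and the function name `files_per_folder` is hypothetical; it only shows the per-folder totals the question describes, with each file credited to every ancestor folder.

```python
from collections import Counter
import posixpath


def files_per_folder(paths):
    """Count files under each folder, including files in its subfolders."""
    counts = Counter()
    for path in paths:
        folder = posixpath.dirname(path)
        # Walk up from the file's folder towards the root, crediting each ancestor.
        while folder not in ("", "/"):
            counts[folder] += 1
            folder = posixpath.dirname(folder)
    return dict(counts)


result = files_per_folder(["/a/file1", "/a/file2", "/a/b/file3"])
print(result)  # {'/a': 3, '/a/b': 1}
```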

Re: phrase suggester's sort mode

2014-09-01 Thread Heval Azizoğlu
Yeah i know that actually currently i am manipulating term suggester to get what i want but this feature is quite basic, i think, that I thought maybe i am missing something and there is an easy way to do that. Thanks for the answer. On Sun, Aug 31, 2014 at 2:18 AM, Nikolas Everett wrote: > You

Re: how to use aggregations with filter?

2014-09-01 Thread Markus Breuer
Can anyone see this post on the mailing list? I see "currently not accepted by mailing list", but I am subscribed to it. -- View this message in context: http://elasticsearch-users.115913.n3.nabble.com/how-to-use-aggregations-with-filter-tp4062464p4062773.html Sent from the ElasticSearch Users mailin

Re: [Hadoop] Hadoop plugin indices with ':' colon - not able to snapshot (?)

2014-09-01 Thread Costin Leau
Hi, ':' has a special meaning in a URI, which is what HDFS uses. You basically have to either escape the character (%3A) or use a different character. Potentially you can rename the file to the desired name by running a separate command after the job has been completed. However, no URI/URL c
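
The escaping mentioned above can be illustrated with Python's standard library. This only shows how ':' becomes %3A; whether the snapshot plugin accepts the escaped name is not something this sketch verifies, and the helper name `escape_index_name` is hypothetical.

```python
from urllib.parse import quote


def escape_index_name(name):
    """Percent-encode characters that are special in a URI, including ':'."""
    return quote(name, safe="")


escaped = escape_index_name("crawl:1")
print(escaped)  # crawl%3A1
```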

[Hadoop] Hadoop plugin indices with ':' colon - not able to snapshot (?)

2014-09-01 Thread Mateusz Kaczynski
Whenever I try to take a snapshot of an index which contains a ':' colon in its name, I end up with the following trace: { "error":"IllegalArgumentException[java.net.URISyntaxException: Relative path in absolute URI: crawl:1]; nested: URISyntaxException[Relative path in absolute URI: crawl:

[ANNOUNCEMENT] - spring-elasticsearch 1.3.0 released

2014-09-01 Thread David Pilato
I am pleased to announce the spring-elasticsearch-1.3.0 release! Spring factories for Elasticsearch Changes in this version include: Changes: o Update to Elasticsearch 1.3.0. Issue: https://github.com/dadoonet/spring-elasticsearch/issues/47. o Update to Spring 4.0.6.RELEASE. Issue: https:

Re: Design advice for ES side-by-side with hadoop cluster?

2014-09-01 Thread Costin Leau
On 9/1/14 4:51 PM, bob.web...@gmail.com wrote: Hi Guys, I have a 16 node hadoop cluster, running Cloudera's community edition. All 16 nodes are big powerful boxes with lots of disk. Can you provide some actual numbers? How much RAM per machine - how much is allocated to Hadoop, how much to E

Re: Date Aggregation Help

2014-09-01 Thread Simon Edwards
Thanks for all your help Vineeth On Monday, September 1, 2014 5:03:38 PM UTC+1, vineeth mohan wrote: > > Hello Simon , > > I hope you are familiar with Elasticsearch API. > I believe by dashboard you are meaning Kibana. > If that is the case , you wont be able to see this result there as Kibana >

Re: How can we integration of hive(hadoop) and elasticsearch

2014-09-01 Thread Costin Leau
Hi, Elasticsearch Hadoop might be of use - it's a connector that allows Hadoop jobs (Map/Reduce, Pig, Hive, Spark, Cascading) to communicate with Elasticsearch. The official home page is here [1] with links to the downloads and the docs. There was also a webinar held recently which you

Re: [Hadoop] Setting Document ID in Map Reduce Mapper

2014-09-01 Thread Costin Leau
Glad to hear the issue has been fixed - not sure how I've missed this email before. It was probably addressed in the way the parameters are handled [1] In the future, when encountering an unexpected behaviour/bug please file an issue on github as well to make sure it's not getting lost. Thank

Re: Date Aggregation Help

2014-09-01 Thread vineeth mohan
Hello Simon, I hope you are familiar with the Elasticsearch API. I believe by dashboard you mean Kibana. If that is the case, you won't be able to see this result there, as Kibana does not currently offer this level of flexibility. You will need to write a JSON agg query based on - http://www.e

Re: Elasticsearch and Hive work together

2014-09-01 Thread Costin Leau
Actually they are completely different. Hive is a library built on top of Hadoop that uses a SQL-like query language to transform (mainly read) data. Elasticsearch is a real-time search and analytics engine. You can read the docs of each library/product to see the differences or better yet, take

Issues when a node comes back after reboot on Azure

2014-09-01 Thread satishmallik
Hi, We are trying to deploy elasticsearch for our production on Azure, I am using following settings, discovery.zen.ping.unicast.hosts: ["saku-es-m01","saku-es-m02","saku-es-m03"] discovery.zen.minimum_master_nodes: 2 node.tag: masternode http.enabled: false # Fixed settings path.d

Re: Elasticsearch-Hadoop repository plugin Cloudera Hadoop 2.0.0-cdh4.6.0

2014-09-01 Thread Mateusz Kaczynski
(Much delayed) thank you Costin. Indeed, on Ubuntu, changing ES_CLASSPATH to include hadoop and hadoop/lib directories in /etc/default/elasticsearch (and exporting it in /etc/init.d/elasticsearch) and installing light plugin version did work.

Re: Elasticsearch-Hadoop repository plugin Cloudera Hadoop 2.0.0-cdh4.6.0

2014-09-01 Thread Mateusz Kaczynski
(Much delayed) thank you Costin. Indeed, on Ubuntu, changing ES_CLASSPATH to include hadoop and hadoop/lib directories in /etc/default/elasticsearch (and exporting it in /etc/init.d/elasticsearch) and installing light plugin version did work. On Thursday, 14 August 2014 20:59:39 UTC, Costin Le

Re: Date Aggregation Help

2014-09-01 Thread Simon Edwards
Hello Vineeth, My question was regarding where to set up the date histograms. Do i simply add a section to the dashboard JSON or do I need to update the index field mappings? If you could provide a small example that'd be great. Thanks for all your help! On Monday, September 1, 2014 4:12:14 P

Re: _ttl inherited from index settings

2014-09-01 Thread David Pilato
You can do that I guess using index templates: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-templates.html#indices-templates -- David Pilato | Technical Advocate | Elasticsearch.com @dadoonet | @elasticsearchfr On 1 September 2014 at 16:58:32, Octavian (octavian
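
A hedged sketch of what such an index template might look like, written as a Python dict. It assumes the Elasticsearch 1.x template and `_ttl` mapping syntax; the index pattern `logs-*` and the default of `7d` are hypothetical values, and the reply itself does not show a template.

```python
import json

ttl_template = {
    "template": "logs-*",  # hypothetical index name pattern
    "mappings": {
        "_default_": {  # assumption: applies to every type in matching indices
            "_ttl": {"enabled": True, "default": "7d"}  # hypothetical default TTL
        }
    },
}

print(json.dumps(ttl_template, indent=2))
```

Under these assumptions, indices created after the template is registered would pick up the `_ttl` mapping without setting it per type.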

Re: Can I find a list of indices containing documents which contain a specified term?

2014-09-01 Thread vineeth mohan
Hello Chris, I am using ES 1.3.1. Can you give it a try on that version? Thanks Vineeth On Mon, Sep 1, 2014 at 6:59 PM, Chris Lees wrote: > > Same result I'm afraid... > > { > "took": 62, > "timed_out": false, > "_shards": { > "total": 147, > "successful":

Re: Date Aggregation Help

2014-09-01 Thread vineeth mohan
Hello Simon , I didn't quite understand your question. Kindly elaborate. Thanks Vineeth On Mon, Sep 1, 2014 at 7:59 PM, Simon Edwards wrote: > Hello Vineeth, > > Many thanks for your reply, sounds like your method should work a treat! > > One quick noob question, where abouts are th

_ttl inherited from index settings

2014-09-01 Thread Octavian
Hello, Is there any setting that specifies the _ttl for all documents in an index? I saw that there is a per-type setting that works [1], but when I try to put the same setting per index, it doesn't work. [1]http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-ttl-fiel

Need Help: Upgrade of ES + Large queries = new CPU overload

2014-09-01 Thread Scott Decker
Hey all, We have been testing the new 1.3.1 release on our current load and queries, and have found that under same conditions, same queries, the es cluster we have just starts to max out cpu and the thread pools fill up and the query times just keep going up until eventually we have the resta

Re: Date Aggregation Help

2014-09-01 Thread Simon Edwards
Hello Vineeth, Many thanks for your reply, sounds like your method should work a treat! One quick noob question, where abouts are the date histograms aggregations created? Are they created on the index or on the dashboard itself? Any help is appreciated. Cheers, Si. On Monday, September 1, 20

Design advice for ES side-by-side with hadoop cluster?

2014-09-01 Thread bob . webman
Hi Guys, I have a 16 node hadoop cluster, running Cloudera's community edition. All 16 nodes are big powerful boxes with lots of disk. I would like to add ES to this cluster, but would like advice on how to configure/design the ES cluster. I bulk load my data using PIG, which means Map-Reduce.

How can we integration of hive(hadoop) and elasticsearch

2014-09-01 Thread Mohit Kumar Yadav
Hi folks, I am trying to use Hive with Elasticsearch. I have successfully installed and run both Hadoop (running on a Windows machine using VMware Player) and Elasticsearch, but I have no clue how to integrate the two. Could you please provide a link where I can find the steps to integrate both

Re: Can I find a list of indices containing documents which contain a specified term?

2014-09-01 Thread Chris Lees
Same result I'm afraid... { "took": 62, "timed_out": false, "_shards": { "total": 147, "successful": 144, "failed": 0 }, "hits": { "total": 9975671, "max_score": 1.0, "hits": [ ... results removed as data is sensitiv

Re: Can I find a list of indices containing documents which contain a specified term?

2014-09-01 Thread vineeth mohan
Hello Chris, That is strange, it's working fine on my side. Can you run the below and paste the result - curl -XPOST 'http://localhost:9200/_search' -d '{ "aggregations": { "aggs": { "terms": { "field": "_index" } } } }' Thanks Vineeth On Mon, Sep 1,

how to get trip length out of a series of gps coordinates

2014-09-01 Thread M R
Hi all, what would be the smartest approach to calculate the distance traveled for a (filtered) series of coordinates? I'm wondering what tools could help out there... something like a sum aggregation over a geopoint field that respects the result order? I hope somebody can help me out. or is it
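
One way to approach this on the client side, sketched under assumptions: fetch the points in time order, then sum the great-circle distance between consecutive points. The haversine formula and the mean Earth radius used here are approximations, and the function names are hypothetical; this is not an Elasticsearch feature.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def trip_length_km(points):
    """Sum the distances between consecutive points; the order of `points` matters."""
    return sum(haversine_km(p, q) for p, q in zip(points, points[1:]))


# Three points one degree apart along the equator.
print(round(trip_length_km([(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)]), 1))
```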

Re: [Hadoop] Setting Document ID in Map Reduce Mapper

2014-09-01 Thread Juan Carlos Fernández
I had the same issue and it was solved by using es-hadoop 2.0.1 instead of 2.0.0. It looks like a fixed bug, but I couldn't find it reported anywhere, either as an open or a closed issue. Regards On Tuesday, 3 June 2014 15:52:21 UTC+2, Daniel Tardón wrote: > > Hi all, > > I'm newbie with ES and i'm try

Re: Date Aggregation Help

2014-09-01 Thread vineeth mohan
Hello Simon, I believe this can be done in this manner. Run 2 separate date histograms, one on the date_submitted field and one on the date_closed field. For a given week, the sum of the date_submitted counts minus the sum of the date_closed counts over all the previous dates should give you the number of open issues for that week. For
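
The arithmetic described above can be sketched in Python, assuming the two histograms have already been fetched and reduced to per-week counts in the same week order. The function name and the sample numbers are hypothetical, and the running totals here include the current week, which is one reading of the description.

```python
from itertools import accumulate


def open_issues_per_week(submitted, closed):
    """Open issues per week from per-week submitted and closed counts."""
    if len(submitted) != len(closed):
        raise ValueError("both histograms must cover the same weeks")
    total_submitted = accumulate(submitted)  # running total of submitted issues
    total_closed = accumulate(closed)        # running total of closed issues
    return [s - c for s, c in zip(total_submitted, total_closed)]


weekly_open = open_issues_per_week([5, 3, 4], [1, 2, 6])
print(weekly_open)  # [4, 5, 3]
```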

Re: Can I find a list of indices containing documents which contain a specified term?

2014-09-01 Thread Chris Lees
Thanks Vineeth. Unfortunately it doesn't return any results in the aggregations result. Input query: GET _search { "aggregations": { "aggs": { "terms": { "field": "_index" } } } } Result JSON showing 26K hits (correct), but no index aggregations: { "took": 4,

Re: Can I find a list of indices containing documents which contain a specified term?

2014-09-01 Thread vineeth mohan
Hello Chris , This should work - { "query" : { // GIVE QUERY HERE }, "aggregations": { "aggs": { "terms": { "field": "_index" } } } } Thanks Vineeth On Mon, Sep 1, 2014 at 3:10 PM, Chris Lees wrote: > > I'm building a simple app which presents the us

Date Aggregation Help

2014-09-01 Thread Simon Edwards
Hi, I was wondering if somebody familiar with aggregations, particularly date histogram aggregations, can point me in the right direction. I'm currently looking to get a total count of records over a specific time period. Each record contains a "date_submitted" field and if they're closed, con

Re: Determine Shard Id based on routing key

2014-09-01 Thread Adrien Grand
On Mon, Sep 1, 2014 at 1:18 PM, 'Sandeep Ramesh Khanzode' via elasticsearch wrote: > However, I am a little concerned with your comment on the equivalence of 1 > index with 20 shards and 20 indices with one shard each. You mentioned that > you would discourage the latter. > > Can you please expla

Re: complexe query ( for me ;) ) with many match_all and range

2014-09-01 Thread alain ibrahim
EDIT 1: this seems to work for a start: curl -XGET 'localhost:9200/donnees/specimens/_search?pretty=true' -d '{ "fields" : ["E_EVENTID", "O_CATALOGNUMBER", "O_RECORDNUMBER"], "query" : { "bool" : { "must" : [ { "query_string":{"que

Re: Determine Shard Id based on routing key

2014-09-01 Thread 'Sandeep Ramesh Khanzode' via elasticsearch
Hi Adrien, Thanks for the reply. That was important for me to understand. However, I am a little concerned with your comment on the equivalence of 1 index with 20 shards and 20 indices with one shard each. You mentioned that you would discourage the latter. Can you please explain why? Is it fo

Re: EL setup for fulltext search

2014-09-01 Thread Marc
Hi Ivan, Using a test index and the analyze API, I was now able to create a config, which is fine for me... theoretically. { "template": "logstash-*", "settings": { "analysis": { "filter": { "my_word_delimiter": { "type": "word_delim

The optimal way to aggregate in Kibana information from multiple Elasticsearch indexes

2014-09-01 Thread Vagif Abilov
I originally posted this question on StackOverflow, but I see that this group might be a more suitable place for it. We are setting up logs from several related applications so the log events are imported into Elasticsearch (via Logstash). It was straightforward to create Kibana dashboards to visu

How to use copy_to on a nested field

2014-09-01 Thread Yann Moisan
Is it possible to use copy_to in a nested field : Here is an extract of my mapping : day: { type: nested properties: { weight: { index_name: bzixtz2fng.day.weight type: double } value: { index_name: bzixtz2fng.day.value

complexe query ( for me ;) ) with many match_all and range

2014-09-01 Thread alain ibrahim
Hello, I'm quite new to Elasticsearch. I made a form that queries an Elasticsearch document to get the data. The form has multiple inputs: image_yes : checkbox NAME : string COLLECTION : string CATALOGNUMBER:string RECORDNUMBER: string LOCALISATION: string E

Can I find a list of indices containing documents which contain a specified term?

2014-09-01 Thread Chris Lees
I'm building a simple app which presents the user with two drop-downs to easily filter data: one for day (mapping to my daily indices), and one for client (a term within documents). I'm currently finding indices using curl -XGET localhost:9200/_aliases, and a simple aggregation query to get a

Kibana Line charts Without timestamp field

2014-09-01 Thread srinu konda
Hi, I am trying to display some data in Kibana. My requirement is to build a line chart without a timestamp field. Is that option available in Kibana? I need to put numerical values on both the x-axis and the y-axis of the line chart... Please help me. Thanks, Srinivas.

Function_Score Filter on _ttl

2014-09-01 Thread Fabian Köstring
Hey there! I have a function_score query and I am trying to get only documents with a ttl greater than or equal to a specific value. But the query doesn't seem to work: I don't get the expected results. GET index1/_search { "query": { "function_score": { "filter":{ "b

[ANN] Elasticsearch Mapper Attachment plugin 2.3.2 released

2014-09-01 Thread Elasticsearch Team
Heya, We are pleased to announce the release of the Elasticsearch Mapper Attachment plugin, version 2.3.2. The mapper attachments plugin adds the attachment type to Elasticsearch using Apache Tika. https://github.com/elasticsearch/elasticsearch-mapper-attachments/ Release Notes - elasticsea

Re: Elastic search : Quesry not executing

2014-09-01 Thread joergpra...@gmail.com
There is no limitation on fields in ES. Each field requires a bit of memory so the limit is only dependent on your hardware resources (RAM, CPU power). I run ~1000 fields if it is of any interest, without significant performance hit. Jörg On Mon, Sep 1, 2014 at 9:34 AM, ravi kumar wrote: > H

Re: Elastic search : Quesry not executing

2014-09-01 Thread vineeth mohan
Hello Ravi, It's not good practice to add an arbitrary number of fields; this will take a toll on performance. Instead of storing it as laptopModelId:"Vostro", store it as "attributes" : [{ "key" : "laptopModelId", "value" : "Vostro" }, { "key" ... "value" ... }] And then declare attributes
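
The restructuring suggested above can be sketched as a small Python transform applied before indexing. The helper name `to_attributes` and the sample value `"open"` are hypothetical; only the key/value layout comes from the message.

```python
def to_attributes(doc):
    """Turn {"field": value, ...} into {"attributes": [{"key": field, "value": value}, ...]}."""
    return {
        "attributes": [
            {"key": key, "value": value}
            for key, value in sorted(doc.items())  # sorted only to make the output stable
        ]
    }


converted = to_attributes({"laptopModelId": "Vostro", "sellAdState": "open"})
print(converted)
```

With this layout the mapping holds two fields, `key` and `value`, however many distinct attribute names the documents carry.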

Re: Elastic search : Quesry not executing

2014-09-01 Thread ravi kumar
Hi Dixit, so will it be OK to put 1000s of fields inside the same _type? This is what I am worried about. Are there any docs that describe this kind of limitation?

Re: Elastic search : Quesry not executing

2014-09-01 Thread ravi kumar
Hi Dixit, so will it be OK to put 1000s of fields inside the same _type? This is what I am worried about.

Re: Elastic search : Quesry not executing

2014-09-01 Thread Bharvi Dixit
Hi Ravi, You need to change your schema & put all these fields under the same type to get the output. And in my experience with Elasticsearch there won't be any problem with a large number of fields. Regards Bharvi On Monday, 1 September 2014 12:25:36 UTC+5:30, ravi kumar wrote: > > Hi david

Re: Elastic search : Quesry not executing

2014-09-01 Thread David Pilato
Yes. -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs > On 1 Sept 2014 at 08:53, ravi kumar wrote: > > Hi david , > > Your query is working but thats not what I want. What i want is to return > record of laptop that with laptopModelId "Vostro" AND sellAdState should be >