storage use by attachment plugin

2014-06-20 Thread Izam Fahmi Alias
Dear all, I have about 20 GB of documents, and I want to index all of the document content using the attachment plugin. My question is: what will the size of the index be? Will it also be 20 GB? Thank you. -- You received this message because you are subscribed to the Google Groups

Re: Extremely slow indexing -- java throwing http exception errors

2014-06-20 Thread Alexander Reelsen
Hey. judging from the exception this looks like an unstable network connection? Are you using persistent HTTP connections? Pinging the nodes by each other is not a problem I guess? --Alex On Thu, Jun 19, 2014 at 12:12 AM, alekjouhar...@gmail.com wrote: Hello all, So here's the issue, our

Re: Splunk vs. Elastic search performance?

2014-06-20 Thread joergpra...@gmail.com
It is correct, as you noted, that Elasticsearch comes with developer settings - that is exactly what a packaged ES is meant for. If you find issues when configuring and setting up ES for critical use, it would be nice to post your issues so others can find help too, and maybe share their

Re: Clarification on has_child filter memory requirements

2014-06-20 Thread Alexander Reelsen
Hey, not all parent documents (and not the data), just their ids. Still this can accumulate, which is the reason why you should monitor the size of that data structure (exposed in the nodes stats). Hope that helps. --Alex On Thu, Jun 19, 2014 at 6:03 AM, Drew Kutcharian d...@venarc.com

Re: problem indexing with my analyzer

2014-06-20 Thread Tanguy Bernard
Information: my note_source contains pictures (.jpg, .png ...) in base64 and text. For my mapping I have used: type = string, analyzer = reuteurs (the name of my analyzer). Any idea? On Thursday, 19 June 2014 17:57:46 UTC+2, Tanguy Bernard wrote: Hello, I have some issue when I index a

Re: Losing data after Elasticsearch restart

2014-06-20 Thread Alexander Reelsen
Hey, the exception you showed, can possibly happen, when you remove an alias. However you mentioned NullPointerException in your first post, which is not contained in the stacktrace, so it seems, that one is still missing. Also, please retry with a newer version of Elasticsearch. --Alex On

Re: Count request does not support [filter]. Why?

2014-06-20 Thread Alexander Reelsen
Hey, not a hundred percent sure, what you mean here. The post_filter setting? There are two possibilities: Either use the search_type=count or use a filtered query in the count API. See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-count.html

Re: Very frequent ES OOM's potential segment merge problems

2014-06-20 Thread Alexander Reelsen
Hey, can you provide more information about the OOM exception? Also you should use the nodes stats API to monitor your system, so you can maybe easily spot, where this memory consumption stems from. Also, are you just indexing or doing searches/queries/gets as well? --Alex On Thu, Jun 19,

Re: ElasticSearch Node.Client Options

2014-06-20 Thread Alexander Reelsen
Hey, a client node with a full 10gb heap and garbage collection does not free anything, so those objects are still in use (which clearly explains THAT the OOM happens, but not WHY). Do you have huge searches going on spanning a lot of shards with deep pagination (all the time)? Do you have some

Re: puppet-elasticsearch options

2014-06-20 Thread Richard Pijnenburg
Hi Andrej, Thank you for using the puppet module :-) The 'port' and 'discovery minimum' settings are both configuration settings for the elasticsearch.yml file. You can set those in the 'config' option variable, for example: elasticsearch::instance { 'instancename': config => { 'http.port' =>

efficient way to store the result of a large slow query

2014-06-20 Thread Chen Wang
Hi guys, Just wondering what is the most efficient way of executing a query that takes time(parent/child documents) and returns large amount of entries, and store the result in randomly evenly divided block to HDFS? e.g, the query will return 100million records and I want every random 1million

Re: Type Ahead feature for contact list

2014-06-20 Thread Omi60
Thanks for the help. I am able to see the correct results now, but could you please suggest how to write the following query in Java: curl -X POST localhost:9200/hotels/_suggest -d '{ "hotels" : { "text" : "m", "completion" : { "field" : "name_suggest" } } }' -- View this message in

Re: Storing auto generated _id under different name

2014-06-20 Thread Johny Lam
I'm using elasticsearch as the database for a service. It would make things easier. For example, I could just return the _source field when other apps query my service. Related to that is that on the javascript client side, I am inserting the _id field into the _source JSON object as id and

Combine elasticsearch/logstash/kibana with hadoop

2014-06-20 Thread kay rus
Hi For performance improvement I'm trying to combine elasticsearch/logstash/kibana with hadoop (cdh4). Unfortunately I'm familiar only with HDFS where I store logs. In my opinion the combination of elasticsearch and hadoop should use hdfs as storage and transparent hadoop map/reduce

Re: Snapshot Restore in a cluster of two nodes

2014-06-20 Thread Alexander Reelsen
Hey, can you be more precise and create a fully fledged example (generating the repository, executing the snapshot on cluster one, executing restore on cluster 2, etc) and include the concrete error message in order to find out what 'the process breaks' means here? Also provide info about

Re: How does shingle filter work on match_phrase in query phase?

2014-06-20 Thread Cédric Hourcade
Hello, Let's say you have an indexed text t1 t2 t3 with shingles. The token positions are also indexed, so you get : t1 (at pos 1), t1 t2 (pos 1), t2 (pos 2), t2 t3 (pos 2) and t3 (pos 3). So if you are searching with a match_phrase for t1 t2 t3 (even if not tokenized as shingles) it will
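The positions listed in the message above can be reproduced with a small sketch. This is plain Python that only imitates the behaviour described in the message; it is not the Lucene shingle filter itself, and the function name and the max_shingle_size parameter are made up for the illustration.

```python
def shingle_tokens(words, max_shingle_size=2):
    """Return (token, position) pairs: unigrams plus shingles.

    Each shingle gets the position of its first word, which mirrors the
    positions listed in the message above (positions start at 1).
    """
    tokens = []
    for i, word in enumerate(words):
        position = i + 1
        tokens.append((word, position))
        # Add every shingle of 2..max_shingle_size words starting here.
        for size in range(2, max_shingle_size + 1):
            if i + size <= len(words):
                tokens.append((" ".join(words[i:i + size]), position))
    return tokens


if __name__ == "__main__":
    for token, position in shingle_tokens(["t1", "t2", "t3"]):
        print(f"{token!r} at pos {position}")
```

Because the unigrams keep consecutive positions 1, 2 and 3, a phrase query built from plain unigrams can still line up with the indexed tokens.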

Re: Very frequent ES OOM's potential segment merge problems

2014-06-20 Thread Paul Sabou
java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot complete merge at org.apache.lucene.index.IndexWriter.commitMerge(IndexWriter.java:3546) at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4272) at

Re: 100% CPU on 1 Node with JMeter Tests

2014-06-20 Thread Cédric Hourcade
Hello, It wouldn't surprise me if both Black Mamba and Slapstick were hitting 100%, they have more shards and have to handle more requests than the others nodes. But in your case it's only one node. First, are you http requests evenly spread over the 4 nodes? You could also check that all your

Re: Count request does not support [filter]. Why?

2014-06-20 Thread Andrew Gaydenko
Sorry, I wasn't clear enough. I mean Java client's CountRequest.source()'s argument content, { filter: ... } in particular.

Re: problem indexing with my analyzer

2014-06-20 Thread Cédric Hourcade
Does it mean you're applying the reuters analyzer on your base64 encoded pictures? I guess it generates a really huge number of tokens for each entry because of your nGram filter (with a max at 250). Cédric Hourcade c...@wal.fr On Fri, Jun 20, 2014 at 9:09 AM, Tanguy Bernard

ElasticSearch queries always return all the datas stored in the index

2014-06-20 Thread Alexandre Touret
hello, https://stackoverflow.com/questions/24323480/elasticsearch-queries-always-return-all-the-datas-stored-in-the-index# I'm trying to index and query an index stored in ES 1.2. I both create and populate the index with the Java API, using the TransportClient API. I have the following

How to set the query resultset size to infinite

2014-06-20 Thread Nuno Carvalho
Hi all, I just joined the mailing list, so sorry if this topic was discussed before. I would like to set the query size to infinite (or no limit). http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-from-size.html This page explains what the parameters do, but

Re: ElasticSearch queries always return all the datas stored in the index

2014-06-20 Thread David Pilato
Hey Alexandre, This is correct. You are searching for a carte which contains an adherent. Elasticsearch gives you a carte object as an answer. And elasticsearch gives you back exactly what you have indexed. That being said, I think you could look at parent/child feature for that use case. Or

Re: ElasticSearch queries always return all the datas stored in the index

2014-06-20 Thread Alexandre Touret
Hello, thanks for your response. When I add another carte: PUT /tp/carte/20450813 { "dateEdition": "2014-06-01T22:00:00.000Z", "adherents": [ { "birthday": "1963-03-22T23:00:00.000Z", "firstname": "FLORENCE",

Re: ElasticSearch queries always return all the datas stored in the index

2014-06-20 Thread David Pilato
Searching for DOE gives you that answer?  If so, it's not normal IMHO. You should try to reproduce it with a full SENSE script recreation so we can replay it and help you from here. See http://www.elasticsearch.org/help/ for information. About parent child, you could read this: 

Re: How to set the query resultset size to infinite

2014-06-20 Thread David Pilato
You don't want to do that! If you need to extract (download) 1 000 000 000 records, you need to use the scan/scroll API: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/scan-scroll.html#scan-scroll -- David Pilato | Technical Advocate | Elasticsearch.com @dadoonet |
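The paging pattern behind that advice can be sketched as a loop that keeps asking for the next page until an empty page comes back. The sketch below is plain Python and does not talk to Elasticsearch: start_scan, fetch_next and make_fake_cluster are hypothetical stand-ins for the real HTTP calls, and the response shape (a "_scroll_id" plus "hits.hits") is assumed from the scan/scroll guide linked above.

```python
def scroll_all(start_scan, fetch_next):
    """Yield every hit from a scan/scroll style export.

    start_scan() -> response dict containing "_scroll_id"
    fetch_next(scroll_id) -> response dict with "_scroll_id" and "hits"
    Both callables are hypothetical stand-ins for real HTTP requests.
    """
    scroll_id = start_scan()["_scroll_id"]
    while True:
        page = fetch_next(scroll_id)
        hits = page["hits"]["hits"]
        if not hits:  # an empty page means the export is finished
            break
        for hit in hits:
            yield hit
        scroll_id = page["_scroll_id"]  # always continue with the latest id


def make_fake_cluster(docs, page_size):
    """In-memory stand-in so the loop can run without a cluster."""
    state = {"offset": 0}

    def start_scan():
        return {"_scroll_id": "scroll-0"}

    def fetch_next(scroll_id):
        start = state["offset"]
        batch = docs[start:start + page_size]
        state["offset"] = start + len(batch)
        return {
            "_scroll_id": "scroll-%d" % state["offset"],
            "hits": {"hits": [{"_source": doc} for doc in batch]},
        }

    return start_scan, fetch_next


if __name__ == "__main__":
    start_scan, fetch_next = make_fake_cluster(list(range(10)), page_size=4)
    print(sum(1 for _ in scroll_all(start_scan, fetch_next)))
```

Only one page is held in memory at a time, which is the point of scrolling instead of requesting an unbounded size.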

Re: problem indexing with my analyzer

2014-06-20 Thread Tanguy Bernard
Yes, I am applying reuters on my document (composed of text and pictures). My goal is to search the text of the document by any word or part of a word. Yes, the problem is my nGram filter. How do I solve this problem? Decrease the nGram max? Replace the analyzer with another one, but which

Re: ElasticSearch queries always return all the datas stored in the index

2014-06-20 Thread Alexandre Touret
Yes, my request for doe always returns that answer. On Friday, 20 June 2014 11:24:33 UTC+2, David Pilato wrote: Searching for DOE gives you that answer? If so, it's not normal IMHO. You should try to reproduce it with a full SENSE script recreation so we can replay it and help you from

Re: ElasticSearch queries always return all the datas stored in the index

2014-06-20 Thread Cédric Hourcade
It looks like you are doing a GET rather than a POST, if so your query content is ignored. Cédric Hourcade c...@wal.fr On Fri, Jun 20, 2014 at 11:26 AM, Alexandre Touret alexan...@touret.info wrote: Yes, my request for doe always returns that answer. On Friday, 20 June 2014 11:24:33

Re: ElasticSearch queries always return all the datas stored in the index

2014-06-20 Thread Alexandre Touret
That's right. Thanks for your help :) Regards. On Friday, 20 June 2014 11:28:26 UTC+2, Cédric Hourcade wrote: It looks like you are doing a GET rather than a POST, if so your query content is ignored. Cédric Hourcade c...@wal.fr On Fri, Jun 20, 2014 at 11:26 AM,

Re: ElasticSearch queries always return all the datas stored in the index

2014-06-20 Thread David Pilato
No. GET works for running searches. It could be an issue if you are using an OLD SENSE version and not Marvel. -- David Pilato | Technical Advocate | Elasticsearch.com @dadoonet | @elasticsearchfr On 20 June 2014 at 11:28:23, Cédric Hourcade (c...@wal.fr) wrote: It looks like you are doing

Re: problem indexing with my analyzer

2014-06-20 Thread Tanguy Bernard
I set max_gram=20. It's better but at the end I have this many times : [2014-06-20 11:42:14,201][WARN ][monitor.jvm ] [ik-test2] [gc][young][528][263] duration [2s], collections [1]/[2.1s], total [2s]/[43.9s], memory [536mb]-[580.2mb]/[1015.6mb], all_pools {[young]

Re: ElasticSearch queries always return all the datas stored in the index

2014-06-20 Thread Alexandre Touret
I just upgraded to ES 1.2.1 and the latest release of Marvel. I have the same behaviour. On Friday, 20 June 2014 11:34:59 UTC+2, David Pilato wrote: No. GET works for running searches. It could be an issue if you are using an OLD SENSE version and not Marvel. -- *David Pilato* |

Re: How to set the query resultset size to infinite

2014-06-20 Thread Nuno Carvalho
Right... that makes sense :) I'll give it a try, thank you! Nuno On Friday, 20 June 2014 10:26:07 UTC+1, David Pilato wrote: You don't want to do that! If your need is to extract (download) 1 000 000 000 records, you need to use scanscroll API:

Re: problem indexing with my analyzer

2014-06-20 Thread Tanguy Bernard
The user copies and pastes the content of an HTML page, and I index this information. I take the entire document, including images. I can't change this behavior. I set max_gram=20. It's better, but at the end I get this many times: [2014-06-20 11:42:14,201][WARN ][monitor.jvm ] [ik-test2]

Re: ElasticSearch queries always return all the datas stored in the index

2014-06-20 Thread Cédric Hourcade
Ah yes, sorry, you are right, I am using some old tools :) Cédric Hourcade c...@wal.fr On Fri, Jun 20, 2014 at 11:49 AM, Alexandre Touret alexan...@touret.info wrote: I just upgraded to ES 1.2.1 and the latest release of Marvel. I have the same behaviour. On Friday, 20 June 2014 11:34:59

Re: How does shingle filter work on match_phrase in query phase?

2014-06-20 Thread 陳智清
Hello Hourcade, Thanks for your response. Does that mean different values should be set to index_analyzer and search_analyzer? (e.g. index_analyzer: shingle, and search_analyzer: standard) What if I want to re-use the same shingle analyzer in both index and search? will the match_phrase t1 t2

Re: How do people typically handle shard failures in their results?

2014-06-20 Thread Shay Banon
If it fails on the primary shard, then a failure is returned. If it worked, and a replica failed, then that replica is deemed a failed replica, and will get allocated somewhere else in the cluster. Maybe an example of where a failure on all shards would help here? On Jun 18, 2014, at 11:45,

Re: How do people typically handle shard failures in their results?

2014-06-20 Thread Nikolas Everett
On Fri, Jun 20, 2014 at 7:08 AM, Shay Banon kim...@gmail.com wrote: If it fails on the primary shard, then a failure is returned. If it worked, and a replica failed, then that replica is deemed a failed replica, and will get allocated somewhere else in the cluster. Maybe an example of where a

Re: How does shingle filter work on match_phrase in query phase?

2014-06-20 Thread Cédric Hourcade
Yes, you can use two different analyzers. In your case what you can do is: - for indexing you apply a shingle filter. - for the query you also apply a shingle filter, but this time you disable the unigrams (output_unigrams: false), so it will only generate the shingles, in your case : t1
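A minimal sketch of what such a two-analyzer setup might look like, written as Python dictionaries that can be serialised into an index-creation request body. The filter, analyzer, type and field names (index_shingles, search_shingles, shingle_index, shingle_search, doc, body) are invented for the illustration. The shingle filter type, its output_unigrams option, and the index_analyzer / search_analyzer mapping keys are the ones mentioned in this thread, but the exact layout should be checked against the documentation for the Elasticsearch version in use.

```python
import json


def shingle_index_body(field="body", doc_type="doc"):
    """Build an index-creation body with separate index/search analyzers."""
    settings = {
        "analysis": {
            "filter": {
                # Index time: unigrams and shingles are both emitted.
                "index_shingles": {"type": "shingle"},
                # Search time: shingles only, no unigrams.
                "search_shingles": {"type": "shingle", "output_unigrams": False},
            },
            "analyzer": {
                "shingle_index": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "index_shingles"],
                },
                "shingle_search": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "search_shingles"],
                },
            },
        }
    }
    mappings = {
        doc_type: {
            "properties": {
                field: {
                    "type": "string",
                    "index_analyzer": "shingle_index",
                    "search_analyzer": "shingle_search",
                }
            }
        }
    }
    return {"settings": settings, "mappings": mappings}


if __name__ == "__main__":
    print(json.dumps(shingle_index_body(), indent=2))
```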

Re: How do people typically handle shard failures in their results?

2014-06-20 Thread Shay Banon
Ahh, I see. If it's related to searches, then yes, the search response includes details about the total shards that the search was executed on, the successful shards, and failed shards. They are important to check to understand if one gets partial results. In the REST API, if there is a total
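A client-side check for partial results might look like the sketch below. It assumes the search response has already been parsed into a dictionary and that the shard counts sit under a "_shards" key with "total", "successful" and "failed" entries; the helper name shard_summary is made up for the illustration.

```python
def shard_summary(response):
    """Summarise the shard counts of a parsed search response.

    Returns a dict with the three counts and a "partial" flag that is
    True when fewer shards succeeded than were targeted.
    """
    shards = response.get("_shards", {})
    total = shards.get("total", 0)
    successful = shards.get("successful", 0)
    failed = shards.get("failed", 0)
    return {
        "total": total,
        "successful": successful,
        "failed": failed,
        "partial": successful < total,
    }


if __name__ == "__main__":
    example = {"_shards": {"total": 5, "successful": 4, "failed": 1},
               "hits": {"total": 12, "hits": []}}
    summary = shard_summary(example)
    if summary["partial"]:
        print("partial results: %(successful)d of %(total)d shards answered" % summary)
```

Whether a partial result should be retried, logged or surfaced to the user is an application decision; the sketch only makes the condition visible.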

Re: How does shingle filter work on match_phrase in query phase?

2014-06-20 Thread 陳智清
I got it! Thank you! On Friday, 20 June 2014 8:00:36 PM UTC+8, Cédric Hourcade wrote: Yes, you can use two different analyzers. In your case what you can do is: - for indexing you apply a shingle filter. - for the query you also apply a shingle filter, but this time you disable the unigrams

Re: problem indexing with my analyzer

2014-06-20 Thread Cédric Hourcade
If your base64 encodes are long, they are going to be split into a lot of tokens by the standard tokenizer. These tokens are often going to be a lot longer than standard words, so your nGram filter will generate even more tokens, a lot more than with standard text. That may be your problem

Re: ES v1.1 continuous young gc pauses old gc, stops the world when old gc happens and splits cluster

2014-06-20 Thread Ankush Jhalani
Mike - The above sounds like it happened because machines were sending too many indexing requests and merging was unable to keep pace. The usual suspects would be insufficient CPU or disk speed/bandwidth. This doesn't sound related to the memory constraints posted in the original issue of this thread. Do you see

Re: Losing data after Elasticsearch restart

2014-06-20 Thread Rohit Jaiswal
Hi Alexander, Here is the stack trace for the NullpointerException - [23:24:38,929][DEBUG][action.bulk ] [Rasputin, Mikhail] [17f85dcb67b64a13bfef2be74595087e][0], node[a-eZTR9XRiWq-o0QmsM2aA], [P], s[STARTED]: Failed to execute

Re: problem indexing with my analyzer

2014-06-20 Thread Tanguy Bernard
Thank you, Cédric Hourcade! On Friday, 20 June 2014 15:32:29 UTC+2, Cédric Hourcade wrote: If your base64 encodes are long, they are going to be split into a lot of tokens by the standard tokenizer. These tokens are often going to be a lot longer than standard words, so your nGram

guarding from double-start

2014-06-20 Thread Andrew Gaydenko
There were a couple of times during development workflow I have started ES script the second time. It results in red status (I use Elastic HQ) and not-working. So I'm forced to regenerate all indexes (with all test data) again. It takes noticeable time. At the moment I use this script

Re: searching on nested docs - geting back the nested docs as a response

2014-06-20 Thread liorg
I am not sure highlight will work as i suspect it will encounter the same obstacle, see in: https://github.com/elasticsearch/elasticsearch/issues/5245 as for suggestion #2, this will break our current schema and will require a significant model change (we store the data in MongoDB as well) -

Kibana Terms panel showing date fields as longs?

2014-06-20 Thread Chris Neal
Hello :) I have some log data indexed in ES and trying to visualize in Kibana and getting strange behavior related to dates. I have Terms panel with the following settings: Terms mode: terms Field: date Length 10 Order: count For some reason, the date column in the panel is showing up as a

Result the number of matched terms for a given result.

2014-06-20 Thread Dan Harvey
Hi, Is it possible to get elasticsearch to return the number of terms matched per result in a query? I know these are evaluated as they make up the score but there doesn't seem to be a way to get a simple count. For example with :query => {:in => {:user_ids => [user_ids...],

Re: guarding from double-start

2014-06-20 Thread Maciej Dziardziel
use start-stop-daemon or adapt /etc/init.d/elasticsearch to set up pidfile guarding es instance. Or just run this way: pgrep -f elasticsearch || ./start_es.sh On Friday, June 20, 2014 3:21:08 PM UTC+1, Andrew Gaydenko wrote: There were a couple of times during development workflow I have

Re: guarding from double-start

2014-06-20 Thread Ivan Brusic
You can either use the startup scripts that come with the package when you install via apt/yum [1] or use the service wrapper [2]. [1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-repositories.html [2] https://github.com/elasticsearch/elasticsearch-servicewrapper

Re: guarding from double-start

2014-06-20 Thread Andrew Gaydenko
On Friday, June 20, 2014 6:49:04 PM UTC+4, Maciej Dziardziel wrote: use start-stop-daemon or adapt /etc/init.d/elasticsearch to set up pidfile guarding es instance. Or just run this way: pgrep -f elasticsearch || ./start_es.sh Aha, thanks! In my case pgrep is the most appropriate.

[ANN] Elasticsearch Thrift transport plugin 2.2.0 released

2014-06-20 Thread Elasticsearch Team
Heya, We are pleased to announce the release of the Elasticsearch Thrift transport plugin, version 2.2.0. The Thrift transport plugin allows the REST interface to be used over Thrift on top of HTTP. https://github.com/elasticsearch/elasticsearch-transport-thrift/ Release Notes -

Re: Splunk vs. Elastic search performance?

2014-06-20 Thread Brian
Thomas, Thanks for your insights and experiences. As I am someone who has explored and used ES for over a year but is relatively new to the ELK stack, your data points are extremely valuable. Let me offer some of my own views. Re: double the storage. I strongly recommend ELK users to disable

boolean multi-field silently ignored in 1.2.1

2014-06-20 Thread Bruce Ritchie
I'm seeing multi-fields of type boolean silently being reduced to a normal boolean field in 1.2.1 which wasn't the behavior in 0.90.9. See https://gist.github.com/Omega359/0c2a93690b4db30693a1 for an example of this. Is this expected? To me it seems like it should work - the boolean field

Re: Penalty or boost from a boolean property

2014-06-20 Thread David Pilato
Function_score is the way to go IMHO. Best -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 20 June 2014 at 19:50, hugo lassiege hlassi...@gmail.com wrote: Hi, I'm looking for help :) This is maybe trivial but I can't find the good solution. I have some

Penalty or boost from a boolean property

2014-06-20 Thread hugo lassiege
Hi, I'm looking for help :) This is maybe trivial but I can't find a good solution. I have some documents and those documents have two boolean properties, basically thumbs up and thumbs down, to show whether or not the administrator approves of those documents. I try to boost a document if it is

Getting complete value from ElasticSearch query

2014-06-20 Thread Vinay Pandey
I have the following structure on my ElasticSearch: { _index: 3_exposureindex _type: exposuresearch _id: 12738 _version: 4 _score: 1 _source: { Name: test2_update Description: CreateUserId: 8

Re: Getting complete value from ElasticSearch query

2014-06-20 Thread Vinay Pandey
I forgot to mention that I have asked the same question in StackOverflow http://stackoverflow.com/questions/24333655/getting-complete-value-from-elasticsearch-query On Friday, June 20, 2014 11:52:49 AM UTC-7, Vinay Pandey wrote: I have the following structure on my ElasticSearch: {

deleting documents that are missing fields

2014-06-20 Thread Jeff Dupont
I can easily query for documents that are missing a particular term field, however I'd like to free up that space and remove those documents. I've tried this with no luck: DELETE /my_index/pages/_search { "filter" : { "missing" : { "field" : "sentences",

Re: Getting complete value from ElasticSearch query

2014-06-20 Thread Vinay Pandey
This just got answered: You should be able to specify _source in the fields. Example: { "fields": [ "_parent", "_source" ], "query": { "terms": { "Id": [ 12738 ] } } } On Friday, June 20, 2014 11:52:49 AM UTC-7, Vinay Pandey wrote: I have the following

Terms aggregation for multiple fields

2014-06-20 Thread Madhavan Ramachandran
Hi Team, I am new to elasticsearch and learning about the searchapi/queryapi in elasticsearch. I have a requirement to fetch the data from ES. My data is as below assume in a table format Prop-Name Type Use Place1 Sale Office Place2 LeaseOffice

Re: cassandra river plugin installation issue

2014-06-20 Thread Shams Haque
Hi, The issue was not with Hector API, issue has been fixed by using WITH COMPACT STORAGE when creating column families in Cassandra. Here i have posted it: http://stackoverflow.com/questions/21089453/cassandra-column-name-trailing-with-blank-characters

Re: deleting documents that are missing fields

2014-06-20 Thread Ivan Brusic
I do not use delete by query, but have you tried using a fully formed query and not just a filter? Perhaps an implicit match_all query is not being set. Try using a filtered query with a match_all query and your filter.
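The suggestion above, wrapping the filter in a filtered query with an explicit match_all, can be sketched as a request body built in Python. The helper name is made up, and the field name sentences comes from the question. The snippet only builds and prints the JSON; it does not send it anywhere. In the 1.x releases discussed in this thread, delete-by-query was, as far as I recall, exposed under a _query endpoint rather than _search, so treat the endpoint as something to confirm in the documentation for your version.

```python
import json


def missing_field_delete_body(field):
    """Body for a delete-by-query that targets docs lacking `field`.

    The filter is wrapped in a filtered query with an explicit match_all,
    as suggested in the reply above.
    """
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"missing": {"field": field}},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(missing_field_delete_body("sentences"), indent=2))
```

Running the same body through a normal search first is a cheap way to check which documents would be removed.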

Re: get rid of _all to optimize storage and perfs (Re: Splunk vs. Elastic search performance?)

2014-06-20 Thread Brian
Patrick, Here's my template, along with where the _all field is disabled. You may wish to add this setting to your own template, and then also add the index setting to ignore malformed data (if someone's log entry occasionally slips in null or no-data instead of the usual numeric value): {

Re: HIVE-Elasticsearch [mapr-elasticsearch] write to elasticsearch issue

2014-06-20 Thread shankarramshivram
Hi Costin, Thanks for the tip. I replaced the old version of jackson and it works now :). Cheers Shankar On Sunday, June 15, 2014 3:09:27 AM UTC-6, Costin Leau wrote: What version of MapR are you using? MapR uses an old version of jackson which es-hadoop should detect and use an

issues with file input from logstash to elastic - please read

2014-06-20 Thread Eitan Vesely
Guys, I've been struggling with this issue for more than a week; if possible, please take a look and try to help :-( I have a config file that I'm running logstash with, which is supposed to fetch the log file I specified in it and stream it to elasticsearch. The problem is that it worked

Disabling date detection [Hive-Elasticsearch]

2014-06-20 Thread shankarramshivram
Hi, my write to ES from MapR fails because automatic date detection is enabled. Is there a way to disable date detection from the external Hive table properties? Please guide me regarding this.

Re: boolean multi-field silently ignored in 1.2.1

2014-06-20 Thread Clinton Gormley
heya bruce that looks like a bug - please open an issue clint On 20 June 2014 19:41, Bruce Ritchie bruce.ritc...@gmail.com wrote: I'm seeing multi-fields of type boolean silently being reduced to a normal boolean field in 1.2.1 which wasn't the behavior in 0.90.9. See

Re: issues with file input from logstash to elastic - please read

2014-06-20 Thread Mark Walkom
You'll have better luck sending this to the Logstash mailing list :) Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 21 June 2014 08:02, Eitan Vesely eitan...@gmail.com wrote: Guys, its been more than a week i've

Re: How to find the number of authors who have written between 2-3 books?

2014-06-20 Thread Clinton Gormley
Alternatively, if you model this with parent-child, then you can use min_children/max_children which is available in the next release http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-has-child-filter.html#_min_max_children_2 clint On 20 June 2014 17:15, Mike

Re: Splunk vs. Elastic search performance?

2014-06-20 Thread Mark Walkom
I wasn't aware that the elasticsearch_http output wasn't recommended? When I spoke to a few of the ELK devs a few months ago, they indicated that there was minimal performance difference, at the greater benefit of not being locked to specific LS+ES versioning. Regards, Mark Walkom Infrastructure

Re: guarding from double-start

2014-06-20 Thread Clinton Gormley
And in your config file, set: node.max_local_storage_nodes: 1 that way you won't start two nodes on a single instance On 20 June 2014 16:54, Andrew Gaydenko andrew.gayde...@gmail.com wrote: On Friday, June 20, 2014 6:49:04 PM UTC+4, Maciej Dziardziel wrote: use start-stop-daemon or

Re: problem indexing with my analyzer

2014-06-20 Thread Clinton Gormley
You seriously don't want 3..250 length ngrams That's ENORMOUS Typically set min/max to 3 or 4, and that's it http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_ngrams_for_partial_matching.html#_ngrams_for_partial_matching On 20 June 2014 16:05, Tanguy Bernard
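To see why a 3..250 range is so costly, it helps to count how many n-grams a single token produces. The sketch below is plain arithmetic in Python, not the Lucene nGram filter; the 250-character token is an assumed example of a long base64 fragment.

```python
def ngram_count(token_length, min_gram, max_gram):
    """Number of n-grams emitted for one token of the given length.

    For each gram size k between min_gram and max_gram (capped at the
    token length) there are token_length - k + 1 substrings.
    """
    total = 0
    for size in range(min_gram, min(max_gram, token_length) + 1):
        total += token_length - size + 1
    return total


if __name__ == "__main__":
    length = 250  # assumed length of one long base64 fragment
    for lo, hi in [(3, 250), (3, 20), (3, 4)]:
        print("min=%d max=%d -> %d grams" % (lo, hi, ngram_count(length, lo, hi)))
```

For one 250-character token this comes to 30876 grams with 3..250, 4311 with 3..20, and 495 with 3..4, which matches the advice above to keep min and max at 3 or 4.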

Adding order to a terms aggregator results in ArrayIndexOutOfBoundsException

2014-06-20 Thread debo
I have a simple document schema on which I am trying to run the following query : curl -XPOST 'localhost:9200/indexName/topn/_search?pretty' -d '{ "aggregations" : { "applid" : { "terms" : { "field" : "applid", "size" : 3, "order" : { "ttbyt_sum" : "desc" }

Re: guarding from double-start

2014-06-20 Thread Andrew Gaydenko
On Saturday, June 21, 2014 2:33:28 AM UTC+4, Clinton Gormley wrote: And in your config file, set: node.max_local_storage_nodes: 1 that way you won't start two nodes on a single instance Great, thanks!

Re: Splunk vs. Elastic search performance?

2014-06-20 Thread Brian
Mark, I've read one post (can't remember where) that the Node client was preferred, but have also read where the HTTP interface is minimal overhead. So yes, I am currently using logstash with the HTTP interface and it works fine. I also performed some experiments with clustering (not much,

Re: issues with file input from logstash to elastic - please read

2014-06-20 Thread Brian
Eitan, My recommendation is to use the stdin input in logstash and avoid its file input. Then, for testing you pipe the file into your logstash instance. But in production, you should run the GNU version of *tail -F* (uppercase F option) to correctly follow all forms of rotated logs, and the

Elasticsearch cluster on Azure using ubuntu. The nodes don't see each other

2014-06-20 Thread Pedro Alonso
I just posted this question on Stackoverflow: I have been setting up a cluster of Elasticsearch in Azure, using Ubuntu VM, following the tutorial on the plugin page (elasticsearch-cloud-azure) on github. I've managed to configure everything and I have elasticsearch running, but I have 3

update field type in existing mapping in elastic search

2014-06-20 Thread srikanth ramineni
Hi, can you please provide input on how to update an existing field type in the mapping? Below is the requirement. I have created contractIndex and its type is contract. In it I have the fields contractid as long and contract number as long, but I want to change the contract number type to string.

Re: Elasticsearch cluster on Azure using ubuntu. The nodes don't see each other

2014-06-20 Thread David Pilato
You must create each VM under the same cloud service. azure vm create azure-elasticsearch-cluster Cloud service name is azure-elasticsearch-cluster -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 21 June 2014 at 03:54, Pedro Alonso pedro@gmail.com wrote: I just

Re: update field type in existing mapping in elastic search

2014-06-20 Thread David Pilato
You can't. You basically need to reindex. That said, you can try to use a multifield which adds a String representation of the same field. But old values (old docs) won't have this new field populated. HTH -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 21 June 2014 at

Re: relation between snapshot restore and update_mapping

2014-06-20 Thread JoeZ99
I just discovered that these strange update_mapping log lines come from a completely unrelated thing, so please take this post as invalid and accept my apologies. On Thursday, June 19, 2014 1:21:32 PM UTC-4, JoeZ99 wrote: This is a somewhat bizarre question. I really hope somebody jumps in, because

Re: Clarification on has_child filter memory requirements

2014-06-20 Thread Drew Kutcharian
Thanks Alex. What do you mean by not all parent documents (and not the data), just their ids? What decides which parent document ids get loaded? Also, are the ids that get loaded per query, or do they stay around longer? I ask because in our use case we're going to keep adding more and more