Index based on message content
I am an infant as far as Logstash + Elasticsearch is concerned. Could someone help me out with the following? I need to create an index based on the content of the ACTUAL MESSAGE part of a message forwarded to Elasticsearch by Logstash. For example:

6 Jan 9 07:19:26 w2k8r233110 CEF:0|Trend Micro|Deep Security Manager|8.0.1310|173|Anti-Malware Quarantined File List Exported|3|src=10.100.33.110 suser=System target=10.100.33.111 msg=Installation/Upgrade of Anti-Malware Component on Agent Succeeded.\n\nPrevious Version: \nCurrent Version: {1}\n

In this message, the actual message part is everything from CEF:0 onward. I need to create the index based on 10.100.33.110, so that all messages with this source end up in the same index. Thank you in advance.

-- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/5c8193fd-5e44-4405-8b89-b4aaff5ea654%40googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
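A sketch of how this is typically done in Logstash: grok the src= value out of the message into its own field, then interpolate that field into the elasticsearch output's index setting. The field name and index pattern below are illustrative assumptions, not taken from the original post.

```
# Logstash config sketch (field and index names are illustrative)
filter {
  grok {
    # capture the CEF src= address into its own field
    match => [ "message", "src=%{IP:src_ip}" ]
  }
}

output {
  elasticsearch {
    host => "localhost"
    # one index per source IP
    index => "events-%{src_ip}"
  }
}
```

Note that one index per source IP multiplies shard overhead quickly; indexing everything into one index and filtering on a src_ip field usually scales better.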
Re: Multicast Enabled?
Hey, try setting discovery.zen.ping.multicast.enabled to false, should help. --Alex

On Fri, Jan 31, 2014 at 2:30 AM, Arun Rao arunrao.seat...@gmail.com wrote: Hello - I set the following property in elasticsearch.yml and restarted Elasticsearch: *discovery.zen.ping.multicast.enabled: false* After restarting, I still see two UDP packets going out to the multicast address 224.2.2.4. How do I disable this? Please advise. Thanks!
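For reference, a minimal elasticsearch.yml sketch for fully disabling multicast and pinning discovery to known hosts (the addresses below are placeholders):

```yaml
# elasticsearch.yml: turn multicast discovery off and use unicast instead
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["10.0.0.1", "10.0.0.2"]
```

Also make sure the setting lands in the elasticsearch.yml that the restarted process actually reads; a second node or a stale config on the same machine can keep sending the multicast packets.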
Re: elasticsearch 1.0.0.RC1 jar hosted on mavenrepository depends on corrupted jar ?
Hey, does this still happen? Just to ensure that no local corruption happened, can you delete your Gradle cache? Any easy steps to reproduce? --Alex

On Fri, Jan 31, 2014 at 2:50 AM, Maxime Nay maxime...@gmail.com wrote: Hi, Today I upgraded the dependencies of one of our projects to use elasticsearch 1.0.0.RC1, and for some reason, when building the project (using Gradle), I was getting a corrupted jar (impossible to use or open it). I fixed this by excluding the modules 'lucene-spatial' and 'lucene-suggest' (both version 4.6.0). Am I the only one having this issue? If so, would you have any suggestions for fixing it? Thanks! Maxime
Re: design advice for market data
Hey, so index: month, type: symbol? Might make sense. You could also use routing on the symbol, to ensure that you only query one shard per symbol and thus get faster queries. This presentation might be interesting for you regarding data flows: http://www.elasticsearch.org/videos/big-data-search-and-analytics/ --Alex

On Sat, Feb 1, 2014 at 9:27 PM, Bobby Richards bobby.richa...@gmail.com wrote: Wanting to get some advice on how to go about design. I have some currency market data and I get roughly 10 million events a week, currently stored in Postgres; it ends up being about 10 gigs, though I would like to work on getting this down, obviously. The data is seldom queried, but I have all of my other data in Elasticsearch, which I love, and I am trying to determine the best way to store this. I would like to query by symbol and time, indexing by month so I can drop months whenever. I guess that would mean month/symbol/(unix time for minute). I am far from a data guy, so I am looking for direction, thoughts, etc. Is this even a good use case for Elasticsearch? Thanks, Bobby
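The routing idea can be sketched like this, under assumed names (index ticks-2014-02, type tick, symbol EURUSD): route writes and searches by symbol so each symbol's data sits on a single shard.

```shell
# Index a quote into a monthly index, routed by symbol
curl -XPUT "http://localhost:9200/ticks-2014-02/tick/1?routing=EURUSD" -d '{
  "symbol": "EURUSD",
  "ts": 1391175467000,
  "bid": 1.3521,
  "ask": 1.3523
}'

# Search with the same routing value so only one shard is queried
curl -XPOST "http://localhost:9200/ticks-2014-02/tick/_search?routing=EURUSD" -d '{
  "query": { "match": { "symbol": "EURUSD" } }
}'
```

Dropping an old month is then a cheap delete of the whole monthly index.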
Re: load balance on heterogeneous nodes in a cluster
Hi, please have a look at shard allocation, where you can define that specific indices should be put on your stronger or weaker boxes (or whatever criteria you define). This might be sufficient for a first try: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-allocation.html#index-modules-allocation If you need more sophisticated logic (based on CPU power or memory), you could write your own decider, but I think you can go a mile with shard allocation. This also applies to your second question, where you could put the indices that are searched a lot on the more powerful machines, to make sure that indexing and querying are as fast as possible. Hope this helps... --Alex

On Mon, Feb 3, 2014 at 8:44 AM, xzer LR xiao...@gmail.com wrote: I am evaluating Elasticsearch as our text search solution. The problem is that we cannot guarantee the same hardware for our cluster when new nodes are added, so we need a way to distribute the load intelligently based on machine power. Reading the documentation and source, I found there is a BalancedShardsAllocator for balancing shards between nodes with consideration of shard count, but it still treats the nodes in the cluster as homogeneous. It seems we could implement our own ShardsAllocator to distribute shards by a predefined machine factor (the simplest way, maybe); is there something I missed, or is there already a built-in function providing what we want? A related second question: our search is not IO-bound, because we have big enough memory on all of our machines, but the machines have different CPU core counts. I would like client searches to be distributed to nodes based on core count rather than simple round-robin. Is there any way to do that?
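A sketch of the allocation-filtering approach (the tag values here are made up): give each node a tag in its elasticsearch.yml, e.g. node.tag: strong or node.tag: weak, and then pin an index to the stronger boxes:

```shell
# Restrict an index to nodes tagged "strong" (index and tag names illustrative)
curl -XPUT "http://localhost:9200/busy_index/_settings" -d '{
  "index.routing.allocation.include.tag": "strong"
}'
```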
Histogram created in Kibana doesn't show anything
Hi, I'm trying to configure a histogram panel in Kibana 3 to show the count of a certain document type in Elasticsearch over time. Each document has a 'date' field containing the creation time of the element as a timestamp (epoch milliseconds) saved as a long (e.g. 1391175467000, which is Fri Jan 31 2014 14:37:47 GMT+0100 (CET)). The Elasticsearch mapping marks this field as a date without an explicit format, since the timestamp format should be supported out of the box. When I create a histogram panel for the index containing the documents and set the 'date' field as the time field, the histogram doesn't show what I expected it to show. It's actually empty, and the x-axis shows 01/01/70, which seems very wrong, as none of the documents was created on that date. Any ideas what's going wrong here? Regards, Flo
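One thing worth checking here, assuming hypothetical index/type names my_index/my_type: the field must really be mapped as a date. If the first document to arrive got the field dynamically mapped as a plain long, the mapping stays long, and Kibana's time handling will not line up. Compare the output of GET .../_mapping against something like:

```shell
# Desired mapping sketch; epoch-millisecond longs are accepted for date fields
curl -XPUT "http://localhost:9200/my_index/my_type/_mapping" -d '{
  "my_type": {
    "properties": {
      "date": { "type": "date" }
    }
  }
}'
```

An existing long mapping cannot be changed in place; the data would need to be reindexed after fixing the mapping.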
Re: What are good combinations of search analyzer, index analyzer and query for implementing an effective autocompleter using ElasticSearch ?
On 1 February 2014 20:19, joergpra...@gmail.com wrote: No, the suggester is not restricted to prefixes; in 0.90.4 fuzziness was added, as documented. Fuzzy suggest completion means your query may contain errors within an edit distance.

But it is still a prefix suggester: you can't mix up the order of words. This works really well for well-formulated names, e.g. song titles, but for general search it can fail to match. It's worth using the completion suggester as the first-line search and falling back to edge ngrams if there are no matches.
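For readers following along, a minimal completion-suggester sketch; the index, type, and field names here are invented for illustration:

```shell
# Map a completion field...
curl -XPUT "http://localhost:9200/music/song/_mapping" -d '{
  "song": {
    "properties": {
      "suggest": { "type": "completion" }
    }
  }
}'

# ...then, after indexing documents with a "suggest" value,
# ask for prefix suggestions
curl -XPOST "http://localhost:9200/music/_suggest" -d '{
  "song-suggest": {
    "text": "nir",
    "completion": { "field": "suggest" }
  }
}'
```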
Re: load balance on heterogeneous nodes in a cluster
Right now, ES can control the total number of shards on a node, but not a CPU strength factor. You would have to write your own decider based on the criteria you gave. Note that the mere count of CPU cores may not be a reliable input for an allocation decider, because it does not determine total shard processing capacity: there can be very strong cores, or weak cores, on different CPUs. And even with a CPU-strength-based decider, the weakest node will still determine the overall performance of query and index operations; that is, you cannot compensate for weak CPU power with an allocation decider. Jörg
Re: Elasticsearch Hadoop
There is no duplication per se in HDFS. Hive tables are just 'views' of data: one copy sits unindexed, in raw format, in HDFS; the other is indexed and analyzed in Elasticsearch. You can't combine the two, since they are completely different things: one is a file system, the other a search and analytics engine.

On 09/01/2014 9:49 AM, Badal Mohapatra wrote: Hi, To index Hadoop data into Elasticsearch, as I understand it, we create an external table with the ES storage handler and then copy the data from another internal Hive table. Doesn't that duplicate the data in HDFS? Is there any way to index the Hive internal tables directly, instead of having two tables with the same data? Kind Regards, Badal

-- Costin
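To make the 'view' point concrete, a HiveQL sketch along the lines of the es-hadoop docs; the table, column, and es.resource names are illustrative assumptions:

```sql
-- External table backed by Elasticsearch; no data for it lives in HDFS
CREATE EXTERNAL TABLE logs_es (ts BIGINT, msg STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'logs/entry');

-- INSERT streams rows from the raw HDFS-backed table into Elasticsearch;
-- nothing is copied inside HDFS
INSERT OVERWRITE TABLE logs_es SELECT ts, msg FROM logs_raw;
```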
Re: filtered alias and possible options for forcing filters for queries
Hi, I would go for option 1. It just means that while creating the alias you need to provide a list of indices, and for each index you'll need to provide a filter, which in your case will always be the same, if I got it right. Does that make sense?

On Friday, January 31, 2014 7:07:02 PM UTC+1, Matthias Johnson wrote: In our application we need to restrict access to data, and we would like to use filters for that. The filtered alias (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-aliases.html) seems like a nice way to do that, and my original thought was to simply create an alias for each user and then add a filter to it as needed. That would work quite nicely; however, we split the content potentially across many indices (think Logstash and its default per-day index). Sadly, a filtered alias can only point to one index (wildcards don't seem to work in the 'index' either), which I assume is due to making it also work for document posting and updates. With that said, is there something obvious I'm missing to get the desired functionality of applying/forcing a filter dependent on a user? Right now I'm considering two choices: 1) create per-user aliases for each index they should have access to, and then run the _search against /user_*/; that seems like it quickly becomes hard to scale and manage. 2) store the user filters in ES and have a shim that inserts them at query time; right now I already use a shim to authenticate via an HTTP header token through an nginx proxy. Any thoughts and opinions would be most welcome. \@matthias
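A sketch of option 1, with an invented owner field standing in for whatever marks a document's user in the real data:

```shell
# Add a filtered alias for one user on one daily index
# (index, alias, and field names are illustrative)
curl -XPOST "http://localhost:9200/_aliases" -d '{
  "actions": [
    { "add": {
        "index": "logstash-2014.02.03",
        "alias": "user_matthias",
        "filter": { "term": { "owner": "matthias" } }
    }}
  ]
}'
```

The add action would be repeated for each daily index the user may see; searches then go against /user_matthias/_search (or the /user_*/ pattern from the original post).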
Re: [Hadoop] capability clarification questions
Hi, This section [1] of the Elasticsearch for Apache Hadoop reference tries to answer your questions. In other words, as opposed to a 'normal' client, es-hadoop 'parallelizes' your reads and writes: if you have a Hadoop job with 5 tasks running in parallel, you'll end up with 5 parallel writes to ES. In a similar vein, when you read data from ES you'll get parallel reads, so if your index has 5 shards, you'll end up with 5 different tasks streaming/reading data from ES. Regarding deployment, es-hadoop works against Apache Hadoop 1.x and 2.x and various other Hadoop distros. There's nothing extra that you have to do to your cluster; again, this is covered in the reference docs [2]. You can install ES on the same physical cluster as Hadoop or on a separate one; it's really up to you and your hardware. As long as you have spare RAM and CPU, you can co-locate the two (which es-hadoop will take advantage of); in fact, you don't have to have the same number of ES and Hadoop nodes, and you can mix and match depending on your requirements. If I understand correctly, you are already reusing the same machines, which is fine. I suggest taking a look at the docs and trying out the examples in them (which you can find in the readme as well); things are easy to install and there's no extra provisioning that needs to be done. Cheers, [1] http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/arch.html [2] http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/features.html

On 30/01/2014 9:55 PM, Josh Harrison wrote: In looking around I haven't been able to find explicit answers to these questions, though that may be entirely because I'm a Hadoop newbie. If we were to deploy ES within a Hadoop environment: the primary benefit is allowing direct interaction with ES from Hadoop, running queries or indexing data, is that right?
Are there explicit benefits to search speed and capability when run through the normal REST or other client APIs? That is to say, if I have a set of N documents and a query that takes T seconds to run on a normal cluster through curl, would there be a marked improvement in T when running the same query through curl against a Hadoop-enabled cluster? Are the ideal architecture designs for a Hadoop-enabled ES cluster the same as, or similar to, a regular cluster? If they're the same, does a Hadoop-enabled cluster need to be designed as such from the start, or can that functionality be tacked onto an already functioning cluster with data? Our situation is that we're on a cluster of machines running Hadoop, but the ES nodes are just running on the compute nodes like a regular service. Wondering what it would take to enable the Hadoop capabilities. Thanks!

-- Costin
Re: EsRejectedExecutionException[rejected execution (queue capacity 50)
The new bounded bulk queue setting is a precaution against overwhelming the internal ES buffers. An unbounded queue size is potentially dangerous, since your node operations may get stalled by too many bulk requests. You could increase the bulk queue_size from 50 to 100 if that works better, but the soundest method is to make your bulk indexing code cap the number of concurrent bulk requests, or to increase the number of nodes/shards. Right now it seems your concurrency setting is too high for the capacity of your cluster (just a wild guess, without knowing your numbers). Jörg

On Fri, Jan 31, 2014 at 6:48 AM, Ramchandra Phadake ramchandra.phad...@gmail.com wrote: Hi, Recently we migrated from 0.90.1 to 0.90.9. We did change index.codec.bloom.load=false initially, but even with true we observe the same behaviour. Nothing else has changed in the indexing code; we are using the bulk API. Any pointers why we get EsRejectedExecutionException? This doesn't happen frequently in the indexing process. We did try setting a queue size of a few thousand, but the exception still comes, though rarely. After setting the bulk queue to -1 it appears to be working. -1 means unbounded size, so is it safe to set it to -1? Thanks, Ram
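For reference, the queue is sized via the thread pool settings in elasticsearch.yml; the value 100 below is just an example, mirroring the suggestion above:

```yaml
# elasticsearch.yml: bounded bulk queue, raised from the default of 50
threadpool.bulk.queue_size: 100
```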
Re: Kibana not loading graphs
Should be fixed with the following commits: https://github.com/salyh/elasticsearch-security-plugin/commit/8301b02e374257040f5f56bd164a0ac10b835370 https://github.com/salyh/elasticsearch-security-plugin/commit/212c9fae027881bc0bb8b6028c2b42abc1579549 https://github.com/salyh/elasticsearch-security-plugin/commit/98e7980304205311def31f46a9b1291d4a500c50 Thank you Ram for your contributions. KR Hendrik

Am Montag, 27. Januar 2014 20:47:06 UTC+1 schrieb Kotamaraja, Ram: Hi, I am working on setting up Kibana and processing the requests through the elasticsearch security plugin: https://github.com/salyh/elasticsearch-security-plugin. When the plugin is enabled, I get the following response for an OPTIONS query. Fiddler trace for input: OPTIONS http://myservername:9200/myindex/_search HTTP/1.1 Host: myservername:9200 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:26.0) Gecko/20100101 Firefox/26.0 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 Accept-Language: en-US,en;q=0.5 Accept-Encoding: gzip, deflate Origin: http:// myservername:8080 Access-Control-Request-Method: POST Access-Control-Request-Headers: content-type Connection: keep-alive Pragma: no-cache Cache-Control: no-cache Fiddler trace for output: HTTP/1.1 200 OK Server: Apache-Coyote/1.1 Access-Control-Allow-Origin: * Access-Control-Max-Age: 1728000 Access-Control-Allow-Methods: OPTIONS, HEAD, GET, POST, PUT, DELETE Access-Control-Allow-Headers: X-Requested-With Content-Type: application/json;charset=UTF-8 *Content-Length: 2* Date: Mon, 27 Jan 2014 19:38:38 GMT *{}* When there is no content, it returns an empty byte array. In this scenario, Kibana is not making any subsequent calls, as it is expecting 'Content-Length: 0'. Is there any way I can make Kibana load graphs by making subsequent calls even if the returned content is NOT empty for the OPTIONS request?
Thank you, -- Ram Kotamaraja | Solutions Architect, DMBI P: (617) 728-0595 | ext: 270595 M: (339) 368-1181 ram.kot...@altisource.com *** This email message and any attachments are intended solely for the use of the addressee. If you are not the intended recipient, you are prohibited from reading, disclosing, reproducing, distributing, disseminating or otherwise using this transmission. If you have received this message in error, please promptly notify the sender by reply email and immediately delete this message from your system. This message and any attachments may contain information that is confidential, privileged or exempt from disclosure. Delivery of this message to any person other than the intended recipient is not intended to waive any right or privilege. Message transmission is not guaranteed to be secure or free of software viruses. ***
Re: Kibana not loading graphs
Great, I'd like to hear from you how it's going. E-Mail: hendrik.s...@gmx.de h.j.s...@googlemail.com i...@saly.de PGP 0x22d7f6ec http://www.saly.de

2014-02-03 Kotamaraja, Ram ram.kotamar...@altisource.com: Hendrik, Thanks for the email. I am glad I am able to contribute. I will start testing the DSL part, and there might be more feedback coming your way. Thank you, -- Ram Kotamaraja | Solutions Architect, DMBI P: (617) 728-0595 | ext: 270595 M: (339) 368-1181 ram.kotamar...@altisource.com
Excluding multiple documents from query
I'm trying to use a bool query to exclude multiple documents from my result set using the must_not clause. When I specify a single _id with a term query, it works just fine, but when I try more than one _id with either term or terms, the results don't come back correctly. Is there another way of doing this? My original (working) query is of the form:

{
  "query": {
    "bool": {
      "must": [ { "term": { "content": "data mining" } } ],
      "must_not": [ { "term": { "_id": "1234" } } ]
    }
  }
}
Elasticsearch gives failed to send join request to master on startup
Hi, I am planning to upgrade from 0.90.0 to 0.90.10, but I am facing problems even starting version 0.90.10. As soon as I start a node on my local machine I get the following error:

[2014-02-03 19:30:40,776][INFO ][discovery.zen] [Night Nurse] failed to send join request to master [[Peter Criss][HnGtgLAbS_CVmHZBKMQ5sQ][inet[/192.168.100.210:9300]]], reason [org.elasticsearch.transport.RemoteTransportException: [Peter Criss][inet[/192.168.100.210:9300]][discovery/zen/join]; org.elasticsearch.transport.RemoteTransportException: [Night Nurse][inet[/192.168.100.90:9300]][discovery/zen/join/validate]; java.io.IOException: Expected handle header, got [56]]
[2014-02-03 19:30:43,834][WARN ][transport.netty ] [Night Nurse] Message not fully read (request) for [4169] and action [discovery/zen/join/validate], resetting

The master node it tries to join is always [Peter Criss], though, as expected, my own node's name changes every time. I couldn't find any help regarding this anywhere.
Re: Excluding multiple documents from query
Try using an ids filter combined with a not filter instead: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-ids-filter.html Filters are faster, cacheable (if desired) and will not influence scoring. If you do not want the excluded ids to be added to the facet counts, use the filter as part of a filtered query. -- Ivan On Mon, Feb 3, 2014 at 5:38 AM, James Massey jmas...@clearedgeit.com wrote: I'm trying to use a bool query to exclude multiple documents from my result set using the must_not clause. When I specify a single _id using a term query, it works just fine, but when I try more than one _id with either the term or terms query, the results don't come back correctly. Is there another way of doing this? My original (working) query is in the form of: { "query": { "bool": { "must": [ { "term": { "content": "data mining" } } ], "must_not": [ { "term": { "_id": "1234" } } ] } } }
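Spelled out, the combination Ivan describes - a filtered query whose filter wraps the ids in a not - would look something like this (the second id is purely illustrative):

```json
{
  "query": {
    "filtered": {
      "query": { "term": { "content": "data mining" } },
      "filter": {
        "not": { "ids": { "values": ["1234", "5678"] } }
      }
    }
  }
}
```

Because the exclusion happens in the filter, it is cacheable and does not affect scoring, and facets computed on the filtered query no longer see the excluded documents.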
Re: Excluding multiple documents from query
Have you tried the ids filter (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-ids-filter.html)? I created a simple index with some data as follows:

curl -XPUT "http://localhost:9200/test_index/"
curl -XPUT "http://localhost:9200/test_index/_bulk" -d'
{"index":{"_index":"test_index","_type":"docs","_id":1}}
{"content": "abc"}
{"index":{"_index":"test_index","_type":"docs","_id":2}}
{"content": "abc"}
{"index":{"_index":"test_index","_type":"docs","_id":3}}
{"content": "abc"}
{"index":{"_index":"test_index","_type":"docs","_id":4}}
{"content": "xyz"}'

and then queried like this:

curl -XPOST "http://localhost:9200/test_index/_search" -d'
{
  "query": {
    "bool": {
      "must": [{ "term": { "content": "abc" }}],
      "must_not": [{ "ids": { "values": [1, 2] }}]
    }
  }
}'

and got back the result I expected:

{ "took": 3, "timed_out": false, "_shards": { "total": 2, "successful": 2, "failed": 0 }, "hits": { "total": 1, "max_score": 0.5945348, "hits": [ { "_index": "test_index", "_type": "docs", "_id": "3", "_score": 0.5945348, "_source": { "content": "abc" } } ] } }

Here is a runnable example you can play with if you want: http://sense.qbox.io/gist/d0ef08e3c5b94706f85ede9ea95229e315dbfa6b - Co-Founder and CTO, StackSearch, Inc. Hosted Elasticsearch at http://qbox.io -- View this message in context: http://elasticsearch-users.115913.n3.nabble.com/Excluding-multiple-documents-from-query-tp4048701p4048707.html Sent from the ElasticSearch Users mailing list archive at Nabble.com.
Re: Excluding multiple documents from query
That worked, thanks everyone!
Re: Improving Bulk Indexing
Jörg, Just so I understand this: if I were to index 100 MB worth of data total with chunk volumes of 5 MB each, this means I have to index 20 times. If I were to set the bulk size to 20 MB, I will have to index 5 times. This is a small data size; picture I have millions of documents. Are you saying the first method is better because GC operations would be faster? Thanks again

On Monday, February 3, 2014 9:47:46 AM UTC-5, Jörg Prante wrote: Note, bulk operates just on network transport level, not on index level (there are no transactions or chunks). Bulk saves network roundtrips, while the execution of index operations is essentially the same as if you transferred the operations one by one. To change the refresh interval to -1, use an update settings request like this: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-update-settings.html

ImmutableSettings.Builder settingsBuilder = ImmutableSettings.settingsBuilder();
settingsBuilder.put("refresh_interval", "-1");
UpdateSettingsRequest updateSettingsRequest = new UpdateSettingsRequest("myIndexName")
    .settings(settingsBuilder);
client.admin().indices()
    .updateSettings(updateSettingsRequest)
    .actionGet();

Jörg
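The REST equivalent of the Java snippet in Jörg's reply is a PUT to the index's _settings endpoint (same placeholder index name) with this body:

```json
{ "index": { "refresh_interval": "-1" } }
```

Setting it back to a positive value (the default is "1s") re-enables periodic refreshes once the bulk load is done.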
0.90.11 and 1.0.0.RC2 released
Greetings! 0.90.11 is out, with caching improvements and added visibility into Lucene's memory use. We've also released what's hopefully going to be our final RC before 1.0.0. Check here for more info: http://www.elasticsearch.org/blog/0-90-11-1-0-0-rc2-released/ Thanks! Elasticsearch developers
Date Histogram Facet with custom bucketing
Hi All, I need a facet that does exactly what the Date Histogram Facet does, but with a different (complicated) monthly bucketing. Rather than the start/end of every month, I need a monthly period that begins/ends on the Friday before the 2nd Wednesday of every month. It's been a little while since I last used Elasticsearch - any pointers on how to go about this? Many thanks. -N
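The date histogram facet can't express this boundary natively, so one workaround is to compute the bucket key yourself at index time (stored as an extra field you then facet on). A minimal sketch of the boundary computation, in Python for illustration (the function names are made up):

```python
from datetime import date, timedelta

def second_wednesday(year, month):
    """Date of the second Wednesday of the given month."""
    first = date(year, month, 1)
    # In datetime's numbering Monday=0 ... Wednesday=2
    days_to_first_wed = (2 - first.weekday()) % 7
    return first + timedelta(days=days_to_first_wed + 7)

def bucket_start(year, month):
    """The Friday immediately before the second Wednesday: 5 days earlier."""
    return second_wednesday(year, month) - timedelta(days=5)
```

For example, bucket_start(2014, 2) is Friday 2014-02-07, five days before the second Wednesday (2014-02-12); each document's timestamp can then be mapped to the most recent such boundary and that key indexed alongside the date.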
Re: filtered alias and possible options for forcing filters for queries
It does, and that would leave me with only needing to solve the issue of a document update/load, since I won't be able to index against the alias with multiple indexes. For example:

{ "actions" : [
  { "add" : { "index" : "doc_1", "alias" : "test", "filter" : { "term" : { "_all" : "world" } } } },
  { "add" : { "index" : "doc_2", "alias" : "test", "filter" : { "term" : { "_all" : "world" } } } },
  ...
] }

If I end up creating a new index, I will need to update all users' aliases to add the new index. Although if that is missed, the worst case scenario is denied access rather than accidental leaking of data, which feels like a decent default. And for those with a filter I'll have to add it to each index line. It would be nice if the filter could also be specified globally for the alias. That will require some careful syncing, but should be reasonable. I will also need to solve the issue of possible updates to the alias and filter. I suspect that my auth layer can likely validate that the doc is accessible via a HEAD request to the alias, and if it shows up as visible I can permit the update to the actual index instead of the alias ... @matthias

On Monday, February 3, 2014 3:47:12 AM UTC-7, Luca Cavanna wrote: Hi, I would go for option 1. It just means that while creating the alias you need to provide a list of indices, and for each index you'll need to provide a filter, which in your case will always be the same if I got it right. Does it make sense?

On Friday, January 31, 2014 7:07:02 PM UTC+1, Matthias Johnson wrote: In our application we have the need to restrict access to data. We would like to use filters for that. The filtered alias (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-aliases.html) seems like a nice way to do that, and my original thought was to simply create an alias for each user and then add a filter to it as needed. That would work quite nicely; however, we split the content potentially across many indices (think logstash and its default per-day index).
Sadly a filtered alias can only point to one index (wildcards don't seem to work in the 'index' field either), which I assume is due to making it also work for document posting and updates. With that said, is there something obvious I'm missing to get the desired functionality of applying/forcing a filter dependent on a user? Right now I'm considering 2 choices: 1) create per-user aliases for each index they should have access to and then run the _search against /user_*/. That seems like it quickly becomes hard to scale and manage. 2) store the user filters in ES and have a shim that inserts them at query time. Right now I already use a shim to authenticate via an HTTP header token through an nginx proxy. Any thoughts and opinions would be most welcome. @matthias
[ANN] RSS River 1.0.0.RC1 is out!
I am pleased to announce the rssriver-1.0.0.RC1 release! RSS River Plugin offers a simple way to index RSS feeds into Elasticsearch. See full documentation on Github: https://github.com/dadoonet/rssriver Changes in this version include:
o Support enclosure and category tags. Issue: https://github.com/dadoonet/rssriver/issues/16. Thanks to LeStef.
o Use BulkProcessor feature. Issue: https://github.com/dadoonet/rssriver/issues/21.
o Define a default mapping for RSS documents. Issue: https://github.com/dadoonet/rssriver/issues/20.
o Update Rome to new official repository. Issue: https://github.com/dadoonet/rssriver/issues/19.
o Update to elasticsearch 1.0.0.RC1. Test infra moved as well to the elasticsearch test framework. Issue: https://github.com/dadoonet/rssriver/issues/18.
See documentation at: https://github.com/dadoonet/rssriver/ Have fun! -- David Pilato | Technical Advocate | Elasticsearch.com @dadoonet | @elasticsearchfr
nested filter returns 0 results
Hello, I have the following document: <basic><contact>someem...@email.com</contact></basic> I index the above document relying on dynamic mappings. First I set an initial mapping as: { "index1" : { "date_detection": false, "properties": { "basic": { "type": "nested" } } } } I try to search for this document using the following nested filter: NestedFilterBuilder filterBasic = new NestedFilterBuilder("basic", QueryBuilders.fieldQuery("basic.content", "someem...@email.com")) When I print out the mappings (after indexing) I get the following: {"requests":{"date_detection":false,"properties":{"basic":{"type":"nested","properties":{"contact":{"type":"string"}} I get no results (with Elasticsearch v. 0.90.0). Is the filter incorrect for the mapping provided?
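One thing worth checking first: the mapping that comes back defines a field named basic.contact, while the filter targets basic.content, so the filter may simply be pointing at a field that does not exist. A nested filter against the mapped field name would look something like this in the query DSL (a sketch, not the poster's confirmed fix):

```json
{
  "filter": {
    "nested": {
      "path": "basic",
      "query": { "match": { "basic.contact": "someem...@email.com" } }
    }
  }
}
```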
Strange behaviour of custom mapping.. Need guidance.
Hi all, I have created a mapping according to my domain data, using a unique id from one of the fields in the source by setting the _id path to the name of the field to be used. When I post the data in JSON format through a curl command, the id is taken from the source, but when I insert the data through a Java program, the id is generated randomly and is not chosen using the path from the source. The other thing is that I have defined a field as string type, but when I pass integer data to it, elasticsearch stores it in integer format without giving any exception :( What can I do? Thanks in advance, Bharvi Dixit
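For reference, the _id path feature being described is configured per type in the mapping like this (a sketch for the 0.90/1.x mapping API; the type and field names below are placeholders, and the setting was later deprecated and removed):

```json
{
  "mytype": {
    "_id": { "path": "name_field" },
    "properties": {
      "name_field": { "type": "string" }
    }
  }
}
```

One thing to verify is that the Java program indexes into the same index and type that actually carry this mapping; if it hits a type without the _id path, ES falls back to auto-generated ids, which would match the behaviour described.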
Re: elasticsearch 1.0.0.RC1 jar hosted on mavenrepository depends on corrupted jar ?
Hi, It seems deleting my gradle cache did the trick this time (I tried last Friday, but the jar was still corrupted). Thanks!

On Monday, February 3, 2014 12:35:48 AM UTC-8, Alexander Reelsen wrote: Hey, does this still happen? Just to ensure that no local corruption happened, can you delete your gradle cache? Any easy steps to reproduce? --Alex

On Fri, Jan 31, 2014 at 2:50 AM, Maxime Nay maxi...@gmail.com wrote: Hi, Today I upgraded the dependencies of one of our projects to use elasticsearch 1.0.0.RC1, and for some reason, when building the project (using gradle), I was getting a corrupted jar (impossible to use it or open it). I fixed this issue by excluding the modules 'lucene-spatial' and 'lucene-suggest' (both version 4.6.0). Am I the only one having this issue? If so, would you have any suggestion for me to fix it? Thanks! Maxime
What is the difference between query_string and multi-match for querying docs ?
Hi, Can anyone please tell me the detailed difference between these two queries? I studied the documentation but was not able to figure out the difference between the two. Can anyone please explain it with some examples in a more detailed fashion? I expect query_string to give me docs which match the maximum number of terms generated by the search_analyzer against indexed docs, but it is not happening that way. Please help! Thanks
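To make the comparison concrete (field names below are illustrative): query_string parses its input with the full Lucene query-string syntax - AND/OR operators, wildcards, per-field prefixes like title:foo - and can therefore fail on malformed input, while multi_match is essentially a match query run against several fields, treating the input as plain text with no special syntax. A sketch of each:

```json
{ "query": { "query_string": { "query": "quick AND brown", "fields": ["title", "body"] } } }
```

```json
{ "query": { "multi_match": { "query": "quick brown", "fields": ["title", "body"] } } }
```

If users are typing free-form search boxes, multi_match is usually the safer choice; query_string is for callers who deliberately want the operator syntax.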
Re: Improving Bulk Indexing
Not sure if I understand. If I had to index a pile of documents, say 15M, I would build bulk requests of 1000 documents, where each doc is on average ~1K, so I end up at ~1MB. I would not care about different doc sizes as they equal out over the total amount. Then I send this bulk request over the wire. With a threaded bulk feeder, I can control concurrent bulk requests of up to the number of CPU cores, say 32 cores. Then repeat. In total, I send 15K bulk requests. The effect is that on the ES cluster, each bulk request of 1M size allocates only a few resources on the heap and the bulk request can be processed fast. If the cluster is slow, the client sees the ongoing bulk requests piling up before bulk responses are returned, and can control bulk capacity against a maximum concurrency limit. If the cluster is fast, the client receives responses almost instantly, and the client can decide if it is more appropriate to increase bulk request size or concurrency. Does it make sense? Jörg

On Mon, Feb 3, 2014 at 5:06 PM, ZenMaster80 sabdall...@gmail.com wrote: Jörg, Just so I understand this: if I were to index 100 MB worth of data total with chunk volumes of 5 MB each, this means I have to index 20 times. If I were to set the bulk size to 20 MB, I will have to index 5 times. This is a small data size; picture I have millions of documents. Are you saying the first method is better because GC operations would be faster? Thanks again

On Monday, February 3, 2014 9:47:46 AM UTC-5, Jörg Prante wrote: Note, bulk operates just on network transport level, not on index level (there are no transactions or chunks). Bulk saves network roundtrips, while the execution of index operations is essentially the same as if you transferred the operations one by one. To change the refresh interval to -1, use an update settings request like this: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-update-settings.html

ImmutableSettings.Builder settingsBuilder = ImmutableSettings.settingsBuilder();
settingsBuilder.put("refresh_interval", "-1");
UpdateSettingsRequest updateSettingsRequest = new UpdateSettingsRequest("myIndexName")
    .settings(settingsBuilder);
client.admin().indices()
    .updateSettings(updateSettingsRequest)
    .actionGet();

Jörg
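The fixed-size chunking half of the strategy Jörg describes can be sketched client-side as follows (Python for illustration; the function name and byte threshold are placeholders, and the concurrency control he mentions would sit on top of this):

```python
def chunk_bulk_bodies(docs, max_bytes=1_000_000):
    """Group (action, source) line pairs of a bulk request into bodies of
    at most max_bytes each; a single oversized pair still gets its own body."""
    bodies, current, current_size = [], [], 0
    for action, source in docs:
        pair = action + "\n" + source + "\n"
        size = len(pair.encode("utf-8"))
        # flush the current body if adding this pair would exceed the limit
        if current and current_size + size > max_bytes:
            bodies.append("".join(current))
            current, current_size = [], 0
        current.append(pair)
        current_size += size
    if current:
        bodies.append("".join(current))
    return bodies
```

Each returned string is one newline-delimited bulk body ready to POST to _bulk; keeping bodies near a fixed size is what bounds per-request heap use on the cluster side.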
What is the meaning of field^4 from scoring perspective ?
Hi, I was going through the documentation of ES where they have mentioned that we can use a boost for a certain field: Boosting: Use the *boost* operator ^ to make one term more relevant than another. For instance, if we want to find all documents about foxes, but we are especially interested in quick foxes: quick^2 fox. The default boost value is 1, but can be any positive floating point number. Boosts between 0 and 1 reduce relevance. Boosts can also be applied to phrases or to groups: john smith^2 (foo bar)^4 My doubt is: what does ^2 mean - is it multiplying just that term's score by a factor of 2, or multiplying the whole document score by 2? Thanks
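To make the quoted syntax concrete: the boost scales only that clause's contribution to the score, not the whole document score (the final value also passes through Lucene's usual normalization). In the query DSL, quick^2 fox is roughly equivalent to a bool query in which only the quick term carries the boost (the field name content is illustrative):

```json
{
  "query": {
    "bool": {
      "should": [
        { "term": { "content": { "value": "quick", "boost": 2 } } },
        { "term": { "content": "fox" } }
      ]
    }
  }
}
```

Documents matching both terms score higher than with no boost, but only because the quick clause's share of the sum was doubled.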
Best way to handle sub documents
Hi, I have a question about the best way to handle sub documents. My app has to manipulate companies, and a company can have several locations. I must be able to find companies by their countries or their cities. I don't need an exact address. For now, I have the following mapping:

"mappings": { "company" : { "properties" : { "name" : { "type" : "string", "store" : "yes" }, "type" : { "type" : "string", "store" : "yes" }, "employees" : { "type" : "string", "store" : "yes" }, "locations" : { "country" : { "type" : "string", "store" : "yes" }, "city" : { "type" : "string", "store" : "yes" }, "city" : { "type" : "geo_point", "store" : "yes" } } } } }

I would like to have your feedback on several things:
* What's the best nested strategy in my use case: inner object (like the example), nested, or parent/child? Other? And why?
* I tried the 3 possibilities, but I never successfully got the results of the following query: filtering on a company type (web agency for example), a company size and a country. Is it possible?
* I also use Kibana to display some data, but I can't use the nested attributes in the widgets. Is it implemented?

Thanks for your help, Kevin
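As one possible starting point, a nested variant of the mapping might look like the sketch below. Note the pasted mapping defines "city" twice (once as string, once as geo_point) and is missing the "properties" wrapper under "locations"; here the geo_point field is renamed "position" to resolve the duplicate key, which is an assumption about the intent:

```json
{
  "mappings": {
    "company": {
      "properties": {
        "name": { "type": "string" },
        "type": { "type": "string" },
        "employees": { "type": "string" },
        "locations": {
          "type": "nested",
          "properties": {
            "country": { "type": "string" },
            "city": { "type": "string" },
            "position": { "type": "geo_point" }
          }
        }
      }
    }
  }
}
```

With a nested mapping, the type/size/country query then needs the country clause wrapped in a nested filter on path "locations" rather than a flat term filter, which may explain the empty results.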
[hadoop] Store GeoJson using PIG
How can I store GeoJson using PIG? I tried to use TOBAG(lon, lat), however I got the location stored in the following form: location: [ { 0: 23.80323889 } { 0: 44.31903611 } ] Regards, Dumitru
Re: Marvel Document Creation Concern
Same issue here! 1GB of total data with very little traffic, and Marvel is generating a 4.5GB index every day.

On Monday, February 3, 2014 3:11:50 PM UTC-5, David Harrigan wrote: Hi, Firstly, thanks for an awesome plugin. I'm trying it out and so far very impressed. However, I'm very concerned about something. I have a small dataset of about 30,000 documents which only grows by a few thousand every few days. Since installing Marvel (on launch day), my total number of documents has shot up to over 2 million and growing! It seems that Marvel is writing documents every few milliseconds into its indices. I'm watching the Document Count just going up and up! This leads me on to my first question: is this normal? Secondly, if I don't want this number of documents, what do I do? Can I get Marvel to delete data over X days old? Any guidance and advice on this would be greatly appreciated. Thank you kindly, -=david=-
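Two knobs that may help, both hedged since Marvel's settings vary by version: raising the agent's sampling interval in elasticsearch.yml on the monitored nodes, and periodically deleting the old daily .marvel-YYYY.MM.DD indices (for instance with a nightly DELETE on the dated index name, or a retention tool such as curator):

```yaml
# elasticsearch.yml on the monitored nodes
# assumption: marvel.agent.interval defaults to 10s in Marvel 1.x
marvel.agent.interval: 60s
```

A longer interval means coarser charts in the Marvel dashboards in exchange for proportionally fewer monitoring documents.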
Discrepancy on processed nodes
Hello, I need help. I just updated Elasticsearch from 0.90.3 to 0.90.9, and my 2-node cluster now runs with different versions ... I cannot update the version of the 0.90.3 node or remove it. Any idea? Thanks.

$ curl -XGET http://localhost:9200/_nodes/process?pretty
{ "ok" : true, "cluster_name" : "elasticsearch", "nodes" : { "YSSDK3IqQIuquOuFUv8wKg" : { "name" : "David", "transport_address" : "inet[/127.0.0.1:9300]", "hostname" : "Webmaster-Server-2", "version" : "0.90.10", "http_address" : "inet[/127.0.0.1:9200]", "attributes" : { "max_local_storage_nodes" : "1", "master" : "true" }, "process" : { "refresh_interval" : 1000, "id" : 18494, "max_file_descriptors" : 4096, "mlockall" : true } }, "0Wjpr1C2Qcm-l3t6fyZ-UQ" : { "name" : "Book", "transport_address" : "inet[/26.220.169.56:9301]", "hostname" : "Webmaster-Server-2", "version" : "0.90.3", "attributes" : { "client" : "true", "data" : "false" }, "process" : { "refresh_interval" : 1000, "id" : 14758, "max_file_descriptors" : 4096, "mlockall" : false } } } }
Re: ES 0.20.5 adding a new node to a running cluster (unicast mode)
yes of course. In one word: unicast seems like an overhead ... think of a scenario where you need to add more nodes to the cluster (a pretty common one for us): you need to change the yml config on each node and restart the cluster node by node, making sure that all the other nodes see the new node.

I'm pretty sure this is incorrect, actually. I suspect if you had added some existing nodes' IPs to the discovery list on the new node, all the nodes would have formed a cluster. Information about what nodes are in the cluster is stored canonically on the master and replicated around - once a node is known about, its existence should be communicated to the other nodes. Cheers, Dan -- Dan Fairs | dan.fa...@gmail.com | @danfairs | secondsync.com
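For reference, the unicast configuration under discussion lives in elasticsearch.yml; per Dan's point, the new node only needs a few reachable existing nodes in its list rather than every node listing every other (the addresses below are placeholders):

```yaml
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["10.0.0.1:9300", "10.0.0.2:9300"]
```

Once the new node has joined via any listed host, the master propagates its existence to the rest of the cluster, so the existing nodes' configs do not need to change.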
Re: Binding to Public / Private IPs
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-network.html will help. Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com

On 4 February 2014 07:07, Olu oowos...@gmail.com wrote: ElasticSearch noob here. How do I specify the IP address that ES binds to? I have an EC2 instance that runs ES on it. curling localhost:9200 on that machine provides me with ES data. However, trying to query ES using the public IP address of the EC2 instance (curl http://public_ip_address:9200/_search?...) does not work. I have turned off all firewalls on the server; iptables and ip6tables have all been turned off. Ports 9200 and 9300 (the default ports ES uses) are also open. Taking a look at the logs, I can see that ES binds to the private IP address of the EC2 instance when starting up. I've tried changing the network bind_host and publish_host variables to the explicit public IP address, but ES still binds to the private IP address when starting. I'm not sure what else to try. Trying to set up Kibana fails with this setup, b/c even though Kibana and ES are on the same server, Kibana is using the url http://public_ip_address:9200 to try to connect to ES, but it cannot connect using that url. There might be an easy solution to this, but any help in solving this would be greatly appreciated. Thanks!!
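The relevant settings from that page, as a sketch: on EC2 the public IP is NAT'd and cannot be bound directly, so the usual approach is to bind to all interfaces (or the private address) and let clients reach the public address through the instance's security group (the addresses below are placeholders):

```yaml
network.bind_host: 0.0.0.0
# address other nodes/clients should use to reach this node
network.publish_host: 10.0.0.5
```

Trying to set bind_host to the public IP fails on EC2 precisely because that address never appears on any local interface.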
Re: Binding to Public / Private IPs
I've tried taking a look at the network settings. I've even downloaded the cloud-aws plugin and tried changing the host settings to no avail. On Monday, February 3, 2014 4:04:35 PM UTC-5, Mark Walkom wrote: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-network.htmlwill help Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com javascript: web: www.campaignmonitor.com On 4 February 2014 07:07, Olu oowo...@gmail.com javascript: wrote: ElasticSearch noob here. How do I specify the IP address that ES binds to?? I have an EC2 instance that runs ES on it. curling localhost:9200 on that machine provides me with ES data. However, trying to query ES using the public IP address of the EC2 instance (curl http://public_ip_address:9200/_search?...), this does not work. I have turned off all firewalls on the server. iptables and ip6tables have all been turned on. Port 9200 and 9300 (default ports ES uses) are also open. Taking a look at the logs, I can see the ES binds to the private IP address of the ec2 instance when starting up. I've tried changing the network bind_host, publish_host variables to the explicit public IP address but ES still binds to the private IP address when starting. I'm not sure what else to try. Trying to set up Kibana fails with this setup, b/c even though KIbana and ES are on the same server, Kibana is using the the url http://public_ip_address:9200 to try to connect to ES but it cannot connect using that url. There might be an easy solution to this but any help in solving this would be greatly appreciated. Thanks!! -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearc...@googlegroups.com javascript:. 
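A note for readers hitting the same symptom: an EC2 instance's public IP is not configured on any local interface (AWS maps it to the instance via NAT), so Elasticsearch cannot bind to it directly; binding broadly and publishing the private address is the usual pattern. A minimal elasticsearch.yml sketch under that assumption (the addresses here are placeholders, not the poster's actual config):

```yaml
# Sketch only: the public IP of an EC2 instance is NAT-mapped and does not
# exist on the instance itself, so ES can only bind to local addresses.
network:
  bind_host: 0.0.0.0        # listen on every local interface
  publish_host: 10.0.0.5    # placeholder: the instance's *private* IP
```

Clients outside AWS then reach the node through the public IP, provided the security group opens 9200/9300.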
Re: nested filter returns 0 results
Same issue with the nested filter - it returns no results. I actually played around with removing the nested type from the mapping, and that's how I got to write the plain filter. On Monday, February 3, 2014 12:10:09 PM UTC-5, Andra Bennett wrote: Hello, I have the following document: <basic><contact>someem...@email.com</contact></basic> I index the above document relying on dynamic mappings. First I set an initial mapping as: { "index1": { "date_detection": false, "properties": { "basic": { "type": "nested" } } } } I try to search for this document using the following nested filter: NestedFilterBuilder filterBasic = new NestedFilterBuilder("basic", QueryBuilders.fieldQuery("basic.content", "someem...@email.com")) When I print out the mappings (after indexing) I get the following: { "requests": { "date_detection": false, "properties": { "basic": { "type": "nested", "properties": { "contact": { "type": "string" } } } } } } I get no results (with Elasticsearch v0.90.0). Is the filter incorrect for the mapping provided?
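Two details in the post above stand out: the mapping defines basic.contact while the filter queries basic.content, and a nested filter needs a path matching the nested object. A hedged sketch of a query-DSL equivalent that matches the mapping as shown (index name and value are taken from the example; this is not the poster's Java code):

```json
{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "nested": {
          "path": "basic",
          "filter": {
            "query": { "match": { "basic.contact": "someem...@email.com" } }
          }
        }
      }
    }
  }
}
```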
Re: Discrepancy on processed nodes
Hi, Your post looks somewhat similar to something I posted a few days ago. I am unclear why you may be stuck with 2 nodes running different versions of ES. They both need to be the same version to function as a cluster. Your output says your upgraded node is ES 0.90.10 (not 0.90.9). Surely you can upgrade your node running 0.90.3 to this same version? If you are having a problem, what are the steps you tried and the problem you ran into? Tony On Monday, February 3, 2014 12:17:45 PM UTC-8, David Patiashvili wrote: Hello, I need help. I just updated Elasticsearch from 0.90.3 to 0.90.9, and my 2-node cluster now runs different versions... I can not update the version of the 0.90.3 node or delete it. Any idea? Thanks. $ curl -XGET http://localhost:9200/_nodes/process?pretty { "ok": true, "cluster_name": "elasticsearch", "nodes": { "YSSDK3IqQIuquOuFUv8wKg": { "name": "David", "transport_address": "inet[/127.0.0.1:9300]", "hostname": "Webmaster-Server-2", "version": "0.90.10", "http_address": "inet[/127.0.0.1:9200]", "attributes": { "max_local_storage_nodes": "1", "master": "true" }, "process": { "refresh_interval": 1000, "id": 18494, "max_file_descriptors": 4096, "mlockall": true } }, "0Wjpr1C2Qcm-l3t6fyZ-UQ": { "name": "Book", "transport_address": "inet[/26.220.169.56:9301]", "hostname": "Webmaster-Server-2", "version": "0.90.3", "attributes": { "client": "true", "data": "false" }, "process": { "refresh_interval": 1000, "id": 14758, "max_file_descriptors": 4096, "mlockall": false } } } }
Re: Restarting a cluster with existing data - Status Red?
Thx for the input. Nope, ES is being shut down normally, usually by simply stopping the configured ES service, and only after it fully completes executing a shutdown. Tony On Monday, February 3, 2014 2:09:08 PM UTC-8, InquiringMind wrote: Tony, You're not doing a kill -9 during shutdown, I hope. If so, that would result in a large window of opportunity for index corruption. Just something to check for... We always do a normal kill to the pid within the pid file to shut down an ES instance before shutting down the machine itself, or before upgrading the software. And we have never seen any issues with the cluster coming back up in the same (usable, usually yellow or green) state that it was in before the shutdown. On two occasions we have had machines power off due to thermal overload in the server room. This is a drastic event that is usually as dangerous (to disk data integrity) as a kill -9, but in these cases there wasn't any load on the machine, and we experienced no data loss, nor did we see the cluster as anything but green once the machine came back up and the node restarted. Brian
Re: cluster reroute and potential data loss
So during a cluster restart sometimes we get nodes that have unallocated shards; both the primary and replica will be unallocated. They stay stuck in this state, leaving the cluster red. If I force allocation with allow_primary=true, I get a new blank shard, all docs lost. If I force allocation with allow_primary=false, I get an error: { "error": "RemoteTransportException[[yournodename][inet[/10.1.1.1:9300]][cluster/reroute]]; nested: ElasticSearchIllegalArgumentException[[allocate] trying to allocate a primary shard [yourindexname][4]], which is disabled];", "status": 400 } Once the cluster gets to this state, am I just out of luck on recovering the data in these shards? Mark On Mon, Feb 3, 2014 at 4:57 PM, Nikolas Everett nik9...@gmail.com wrote: If all replicas of a particular shard are unallocated and you allow_primary allocate one, then it'll allocate empty. If a node that had some data for that shard comes back, it won't be able to use that data because the shard has been allocated empty. On Mon, Feb 3, 2014 at 4:42 PM, Mark Conlin mark.con...@gmail.com wrote: I was reading some ES doco http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-reroute.html and stumbled upon this part of the Cluster Reroute API: allocate: *Allocate an unassigned shard to a node. It also accepts the allow primary flag to explicitly specify that it is allowed to explicitly allocate a primary shard (might result in data loss).* Why might this result in data loss? If I use: POST /_cluster/reroute { "commands": [ { "cancel": { "index": "myindex", "shard": 4, "node": "somenode", "allow_primary": true } } ] } to get a node that has unallocated shards back to green, how will I know if data loss has occurred? How/why is the data being lost? Thanks, Mark
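For reference, the allocate command the replies discuss looks like the sketch below (index, shard, and node names are the placeholders from the thread). As Nikolas explains, "allow_primary": true is precisely the step that brings the shard up empty, so it should be a last resort:

```json
POST /_cluster/reroute
{
  "commands": [
    {
      "allocate": {
        "index": "myindex",
        "shard": 4,
        "node": "somenode",
        "allow_primary": true
      }
    }
  ]
}
```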
[ANN] JDBC River 1.0.0.RC2.1
Hi, this is the first release of the JDBC river plugin for Elasticsearch 1.0.0.RC2. Changes: - update to Elasticsearch 1.0.0.RC2 - plugin version support - support for java.sql.Array - SQL array test for PostgreSQL Best, Jörg
[ANN] Knapsack plugin 1.0.0.RC2.1
Hi, this is the first release of the Knapsack import/export plugin for Elasticsearch 1.0.0.RC2. Changes: - update to Elasticsearch 1.0.0.RC2 - plugin version support - bug fix: TransportClient now initializes only with the elasticsearch-support plugin and ignores other plugins - added abort command to kill all ongoing imports or exports from REST Best, Jörg
Re: Elasticsearch gives failed to send join request to master on startup
So the versions of 192.168.100.210 and 192.168.100.90 are different? Although Elasticsearch is supposed to support clusters that differ between minor versions, this is no longer true with later versions of 0.90. -- Ivan On Mon, Feb 3, 2014 at 6:02 AM, Sambhav Sharma sambhav.sha...@minjar.com wrote: Hi, I am planning to upgrade from 0.90.0 to 0.90.10, but I am facing problems in even starting version 0.90.10. As soon as I start a node on my local machine I get the following error: [2014-02-03 19:30:40,776][INFO ][discovery.zen] [Night Nurse] failed to send join request to master [[Peter Criss][HnGtgLAbS_CVmHZBKMQ5sQ][inet[/192.168.100.210:9300]]], reason [org.elasticsearch.transport.RemoteTransportException: [Peter Criss][inet[/192.168.100.210:9300]][discovery/zen/join]; org.elasticsearch.transport.RemoteTransportException: [Night Nurse][inet[/192.168.100.90:9300]][discovery/zen/join/validate]; java.io.IOException: Expected handle header, got [56]] [2014-02-03 19:30:43,834][WARN ][transport.netty ] [Night Nurse] Message not fully read (request) for [4169] and action [discovery/zen/join/validate], resetting The master node it tries to join is always [Peter Criss], though as expected my node's name changes every time. I couldn't find any help regarding this anywhere.
Re: cant create cluster on ec2
Hi Jonathan, Just to be clear... you could not cluster? I will try the latest stable release because I am giving up on 1.0.0.RC. Thanks On Sun, Feb 2, 2014 at 10:10 AM, Jonathan Goodwin jonathanwgood...@gmail.com wrote: Hello David and David, I took a look at the logs / configs posted and mine are mostly similar (though I'm not using any rivers) and I'm having the same issue. I've tested on 0.90.9 and 0.90.7; I hadn't tried the 1.0.0 RC yet because the logstash website suggested I stick to 0.90.9. If you need any more logs / config samples I'd be happy to provide those. Best, JG On Friday, January 31, 2014 3:46:07 PM UTC-5, David Montgomery wrote: Hi, In chef I used the -N flag, which changes the host name when using chef-client. I removed the flag and built again. No difference. I have two nodes running. Both have the same cluster name, ports are 100% wide open... and wow.. ES can't talk despite both running in the same region. So... ES... any suggestions? root@ip-10-238-131-236:/var/log/elasticsearch# hostname ip-10-238-131-236 root@ip-10-238-131-236:/var/log/elasticsearch# curl -XGET 'http://127.0.0.1:9200/_cluster/health?pretty=true' { "cluster_name": "mizilaelasticsearch", "status": "green", "timed_out": false, "number_of_nodes": 1, "number_of_data_nodes": 1, "active_primary_shards": 0, "active_shards": 0, "relocating_shards": 0, "initializing_shards": 0, "unassigned_shards": 0 } On Fri, Jan 31, 2014 at 12:20 AM, Ivan Brusic iv...@brusic.com wrote: Did you override your network.host configuration parameter by any chance? -- Ivan On Wed, Jan 29, 2014 at 1:56 PM, David Montgomery davidmo...@gmail.com wrote: Hi, Details: 1) both servers are using the same sec group 2) all ports between 9200 and 9300 are open to 0.0.0.0/0 3) EC2 instances are in the us-east region 4) Did a fresh boot on 2 servers; both servers are running. Both are not talking 5) Attached are the logs for both servers and the common elasticsearch.yml file 6) Below is the chef recipe I wrote.
I am using 1.0.0.RC1 with elasticsearch-cloud-aws/2.0.0.RC1 /usr/share/elasticsearch/plugins/cloud-aws ubuntu@ip-10-73-142-187:/usr/share/elasticsearch/plugins/cloud-aws$ ls aws-java-sdk-1.3.32.jar commons-codec-1.4.jar commons-logging-1.1.1.jar elasticsearch-cloud-aws-2.0.0.RC1.jar httpclient-4.1.1.jar httpcore-4.1.jar

data_bag("my_data_bag")
db = data_bag_item("my_data_bag", "my")
AWS_ACCESS_KEY_ID = db[node.chef_environment]['aws']['AWS_ACCESS_KEY_ID']
AWS_SECRET_ACCESS_KEY = db[node.chef_environment]['aws']['AWS_SECRET_ACCESS_KEY']

version = '1.0.0.RC1'

remote_file "#{Chef::Config[:file_cache_path]}/elasticsearch-#{version}.deb" do
  source "https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-#{version}.deb"
  action :create_if_missing
end

dpkg_package "#{Chef::Config[:file_cache_path]}/elasticsearch-#{version}.deb" do
  action :install
end

bash "add_aws" do
  cwd "/usr/share/elasticsearch"
  code <<-EOH
    git clone https://github.com/elasticsearch/elasticsearch-cloud-aws.git
    /usr/share/elasticsearch/bin/plugin -install elasticsearch/elasticsearch-cloud-aws/2.0.0.RC1
  EOH
  not_if { File.exists?("/usr/share/elasticsearch/elasticsearch-cloud-aws") }
end

service "elasticsearch" do
  supports :restart => true, :start => true, :stop => true
  action [:enable, :start]
end

template "/etc/elasticsearch/logging.yml" do
  path "/etc/elasticsearch/logging.yml"
  source "logging.yml.erb"
  owner "root"
  group "root"
  mode 0755
  notifies :restart, resources(:service => "elasticsearch")
end

template "/etc/elasticsearch/elasticsearch.yml" do
  path "/etc/elasticsearch/elasticsearch.yml"
  source "elasticsearch.yml.erb"
  owner "root"
  group "root"
  mode 0755
  variables :AWS_ACCESS_KEY_ID => "#{AWS_ACCESS_KEY_ID}",
            :AWS_SECRET_ACCESS_KEY => "#{AWS_SECRET_ACCESS_KEY}",
            :region => "us-east-1"
  notifies :restart, resources(:service => "elasticsearch")
end

On Tue, Jan 28, 2014 at 6:40 AM, David Montgomery davidmo...@gmail.com wrote: sure.. let me get the logs together and will post two files.. one log file for each of the servers.
Thanks On Tue, Jan 28, 2014 at 6:25 AM, David Pilato da...@pilato.fr wrote: Is it a known problem? Not yet. That's the reason I'm asking for logs. I would like to understand what is happening here. So full logs on both nodes would help. The way you set unicast is fine to me. But please, could you share with me (even through direct mail) your elasticsearch.yml file and your logs at TRACE level? You can replace your credentials settings with . Please don't change anything else. I'd like to reproduce your case. Thanks! -- David ;-) Twitter: @dadoonet / @elasticsearchfr / @scrutmydocs On 27 January 2014 at 22:57, David Montgomery davidmo...@gmail.com wrote: I have to say.. is this a known problem? I removed the keys. So.. should the servers still cluster? Well... its
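For anyone following this thread, here is a sketch of the elasticsearch.yml settings the cloud-aws plugin documentation of this era describes for EC2 discovery (the cluster name is taken from the thread; the security group name is hypothetical, and multicast is disabled because EC2 does not route it):

```yaml
cluster.name: mizilaelasticsearch
discovery.type: ec2
discovery.zen.ping.multicast.enabled: false   # multicast is not available on EC2
discovery.ec2.groups: my-sec-group            # hypothetical security group name
cloud.aws.access_key: <AWS_ACCESS_KEY_ID>
cloud.aws.secret_key: <AWS_SECRET_ACCESS_KEY>
```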
Re: Improving Bulk Indexing
Thanks again for clarifying this. I think I understand it now; what I was referring to in my prior posts was the difference between setting 1000 documents vs 10,000 documents. I was thinking a bigger chunk volume would produce fewer over-the-wire index requests, but I understand your reasoning about thrashing and slow GC. The numbers below kind of support my theory: as I increased the chunk to 10 MB or 10,000 docs, I saw a slight improvement in total indexing time (I think). I would like to get your/others' feedback on some numbers/benchmarks. I tested with BulkRequest and with BulkProcessor, both with similar results (I seem to think it is slow?).
- Same source for testing (85 MB)
- Running one node / 1 shard / 0 replicas on a local MacBook, 8 cores, 4G RAM
- Bulk batch size 1MB, concurrentRequests = 1: indexed 85 MB in ~17 seconds.
- Bulk batch size 1MB, concurrentRequests = 8: indexed 85 MB in ~15 seconds.
- Bulk batch size 5MB, concurrentRequests = 1: indexed 85 MB in ~15 seconds.
- Bulk batch size 5MB, concurrentRequests = 8: indexed 85 MB in ~17 seconds.
- Bulk batch size 10MB, concurrentRequests = 1: indexed 85 MB in ~13 seconds.
- Bulk batch size 10MB, concurrentRequests = 8: indexed 85 MB in ~13 seconds.
Using number of docs:
- Bulk 1000 docs, concurrentRequests = 1: indexed 85 MB in ~15 seconds.
- Bulk 1000 docs, concurrentRequests = 8: indexed 85 MB in ~13 seconds.
- Bulk 10,000 docs, concurrentRequests = 1: indexed 85 MB in ~15 seconds.
- Bulk 10,000 docs, concurrentRequests = 8: indexed 85 MB in ~12-13 seconds.
OK, so an average of 15 sec for 85 MB, i.e. 5.5 MB/sec. Why do I think this is slow? I am not sure if I am doing the right math, but for 20 million docs (27 TB of data), this would take 2 days? I understand that with better machines (SSDs, more RAM) I will get better results. However, I would like to optimize what I have now to the fullest before scaling up.
What other configurations can I tweak to improve my current test? .put("client.transport.sniff", true) .put("refresh_interval", -1) .put("number_of_shards", 1) .put("number_of_replicas", 0) On Monday, February 3, 2014 2:02:32 PM UTC-5, Jörg Prante wrote: Not sure if I understand. If I had to index a pile of documents, say 15M, I would build bulk requests of 1000 documents, where each doc is on avg ~1K, so I end up at ~1MB. I would not care about different doc sizes as they even out over the total amount. Then I send this bulk request over the wire. With a threaded bulk feeder, I can control concurrent bulk requests of up to the number of CPU cores, say 32 cores. Then repeat. In total, I send 15K bulk requests. The effect is that on the ES cluster, each bulk request of 1M size allocates only a few resources on the heap, and the bulk request can be processed fast. If the cluster is slow, the client sees the ongoing bulk requests piling up before bulk responses are returned, and can control bulk capacity against a maximum concurrency limit. If the cluster is fast, the client receives responses almost instantly, and the client can decide if it is more appropriate to increase bulk request size or concurrency. Does it make sense? Jörg On Mon, Feb 3, 2014 at 5:06 PM, ZenMaster80 sabda...@gmail.com wrote: Jörg, Just so I understand this: if I were to index 100 MB worth of data total with chunk volumes of 5 MB each, this means I have to index 20 times. If I were to set the bulk size to 20 MB, I will have to index 5 times. This is a small data size; picture I have millions of documents. Are you saying the first method is better because GC operations would be faster? Thanks again On Monday, February 3, 2014 9:47:46 AM UTC-5, Jörg Prante wrote: Note, bulk operates just on the network transport level, not on the index level (there are no transactions or chunks).
Bulk saves network roundtrips, while the execution of index operations is essentially the same as if you transferred the operations one by one. To change the refresh interval to -1, use an update settings request like this: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-update-settings.html ImmutableSettings.Builder settingsBuilder = ImmutableSettings.settingsBuilder(); settingsBuilder.put("refresh_interval", -1); UpdateSettingsRequest updateSettingsRequest = new UpdateSettingsRequest(myIndexName).settings(settingsBuilder); client.admin().indices().updateSettings(updateSettingsRequest).actionGet(); Jörg
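The batching scheme Jörg describes (cap each bulk request by both document count and byte volume, then feed batches concurrently) can be sketched independently of Elasticsearch. The function below only does the chunking step; the names and limits are illustrative, not part of any ES client API:

```python
def chunk_bulk(docs, max_docs=1000, max_bytes=1_000_000):
    """Yield batches capped at max_docs documents or max_bytes payload."""
    batch, size = [], 0
    for doc in docs:
        doc_len = len(doc)
        # Close the current batch before it would exceed either cap.
        if batch and (len(batch) >= max_docs or size + doc_len > max_bytes):
            yield batch
            batch, size = [], 0
        batch.append(doc)
        size += doc_len
    if batch:  # flush the final partial batch
        yield batch
```

Each yielded batch would become one bulk request; a threaded feeder can then keep up to N batches in flight, as described above.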
upgrade 0.20 to 1.0
Hi All, it seems I have been slacking on keeping up to date with updates. Will it be possible to upgrade a 0.22.1 cluster to 1.0, or is it better to do it in two steps, i.e. 0.90.x first? Thanks, GX
Cakephp Plugin for ElasticSearch
Need a CakePHP plugin for Elasticsearch.
Re: upgrade 0.20 to 1.0
You will probably want to upgrade to 0.90.0 first. v1.0 is still RC, so if this is production you might want to hold off. Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 4 February 2014 16:21, GX mailme...@gmail.com wrote: Hi All, it seems I have been slacking on keeping up to date with updates. Will it be possible to upgrade a 0.22.1 cluster to 1.0, or is it better to do it in two steps, i.e. 0.90.x first? Thanks, GX
Re: upgrade 0.20 to 1.0
The *easiest* is to do an upgrade directly to v1.0, but I highly doubt that will even work after the upgrade due to the number of changes between 0.2X, 0.90.X and 1.0.0. And frankly, you'd be insane to consider it if you wanted to keep your data. If you can export your data to disk, then reimport/reindex it, that might be an easier option for you. Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 4 February 2014 17:20, GX mailme...@gmail.com wrote: Hi Mark, Thanks for the speedy reply. I was not thinking of an immediate upgrade, since it needs to be scheduled, tested, etc., but rather considering skipping 0.90 since 1.0 is so close to release and doing a one-step upgrade instead of a two-step one: rather than do an upgrade now and then again in a few months, just go straight for 1.0 when its final release is out. My main concern is whether it's possible to skip 0.90 in production. Of course testing will be done, but it never matches real-world circumstances: replicating a test environment with 4 nodes and gigabytes of data may in theory be possible, but matching actual load and results is difficult to do. I suppose in the extreme case we can just do a fresh install of 1.0 and reindex all the data; it just means several hours of 'downtime'. So I guess the real question is which route is easiest as opposed to safest: 0.20 - 0.90 - 1.0 0.20 - 1.0 nuke and pave straight to 1.0 GX On Tuesday, February 4, 2014 7:42:11 AM UTC+2, Mark Walkom wrote: You will probably want to upgrade to 0.90.0 first. v1.0 is still RC so if this is production you might want to hold off. Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 4 February 2014 16:21, GX mail...@gmail.com wrote: Hi All it seems I have been slacking on keeping up to date with updates, will it be possible to upgrade a 0.22.1 cluster to 1.0 or is it better to do it in two steps i.e.
0.90.x first Thanks GX
1.0.0.RC2 lots of [WARN ][discovery.zen.ping.multicast] [Pixx] failed to read requesting data from ***
Hi, It seems that the default ./bin/elasticsearch has changed to run in the foreground instead of the background; is this true? As of beta2 it still seemed to run fine. But when I upgrade to 1.0.0.RC1 or RC2, running ./bin/elasticsearch starts it in the foreground, and it gives me lots of warnings like: [WARN ][discovery.zen.ping.multicast] [Pixx] failed to read requesting data from /10.93.69.138:54328 java.io.IOException: No transport address mapped to [21623] at org.elasticsearch.common.transport.TransportAddressSerializers.addressFromStream(TransportAddressSerializers.java:71) at org.elasticsearch.cluster.node.DiscoveryNode.readFrom(DiscoveryNode.java:267) at org.elasticsearch.cluster.node.DiscoveryNode.readNode(DiscoveryNode.java:257) at org.elasticsearch.discovery.zen.ping.multicast.MulticastZenPing$Receiver.run(MulticastZenPing.java:410) at java.lang.Thread.run(Thread.java:662) Is this expected? Thanks, Chen
Is ElasticSearch 0.90.7 compatible with jetty-0.90.0
I've installed jetty-0.90.0 with the default configurations provided by jetty to secure the ES service. When I check to see if it's working, I get the following message: HTTP/1.1 200 OK Content-Type: text/plain;charset=UTF-8 Access-Control-Allow-Origin: * Content-Length: 0 Server: Jetty(8.1.4.v20120524) But if I try to refresh using $ curl http://localhost:9200/_refresh I get the typical message that it's working: {"ok":true,"_shards":{"total":19,"successful":19,"failed":0}} Any ideas what I may be doing wrong? Thanks
Elastic Search giving invalid format error when using decimal numbers
Hello everyone - I just observed something and I was wondering if anyone has seen something similar. I have a mapping in which there are strings, floats, doubles, etc. Now for the following query, I get an error: IllegalArgumentException[Invalid format: "3000.9" is malformed at ".9"]; { "query": { "filtered": { "query": { "simple_query_string": { "query": "3000.9", "fields": [ ... // this is where I give a list of all the different fields: strings, numbers, dates, etc. ], "default_operator": "and" } } } } } However, when I search for 3000 (note that there is no decimal), it works fine. Any clue what I might be doing wrong here? -Amit.
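The "Invalid format: 3000.9 is malformed at .9" text is a Joda-Time parse error, which suggests a date field in the field list is choking on the term: 3000 happens to parse (as a year), while 3000.9 does not. If the broad field list is intentional, simple_query_string accepts a lenient flag that ignores such per-field format failures. A hedged sketch (the field names here are placeholders, not the poster's mapping):

```json
{
  "query": {
    "simple_query_string": {
      "query": "3000.9",
      "fields": ["name", "price", "created_at"],
      "default_operator": "and",
      "lenient": true
    }
  }
}
```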
kibana ES version errors
I'm running ES 0.90.10 and Kibana master. Kibana emits an error saying it needs ES 0.90.9 or better. But Kibana does not say what version it thinks it's using, or even what URL it is using. From ES: curl http://10.0.136.6:9200 { "ok": true, "status": 200, "name": "s-ops-monitor-01", "version": { "number": "0.90.10", "build_hash": "0a5781f44876e8d1c30b6360628d59cb2a7a2bbb", "build_timestamp": "2014-01-10T10:18:37Z", "build_snapshot": false, "lucene_version": "4.6" }, "tagline": "You Know, for Search" } From the Kibana config: elasticsearch: "http://10.0.136.6:9200", Is there any way to debug Kibana?
Re: 1.0.0.RC2 lots of [WARN ][discovery.zen.ping.multicast] [Pixx] failed to read requesting data from ***
Hey,

yes, elasticsearch now starts in the foreground by default; see http://www.elasticsearch.org/guide/en/elasticsearch/reference/master/breaking-changes.html for a list of breaking changes compared to 0.90.

The other problem might stem from the fact that you are still using an old elasticsearch version somewhere (maybe a node client?) or different JVM versions (but since it worked before, I rather suspect the first). Can you check that? Also, there is an IP in the log (10.93.x.y): does that run a valid elasticsearch instance, or just something that tries to connect to your cluster?

--Alex

On Tue, Feb 4, 2014 at 8:18 AM, Chen Wang chen.apache.s...@gmail.com wrote:

Hi,
It seems that the default ./bin/elasticsearch now runs in the foreground instead of the background - is this true? As of beta2 it still seemed to run fine, but after upgrading to 1.0.0.RC1 or RC2, running ./bin/elasticsearch starts in the foreground and gives me lots of warnings like:

```
[WARN ][discovery.zen.ping.multicast] [Pixx] failed to read requesting data from /10.93.69.138:54328
java.io.IOException: No transport address mapped to [21623]
    at org.elasticsearch.common.transport.TransportAddressSerializers.addressFromStream(TransportAddressSerializers.java:71)
    at org.elasticsearch.cluster.node.DiscoveryNode.readFrom(DiscoveryNode.java:267)
    at org.elasticsearch.cluster.node.DiscoveryNode.readNode(DiscoveryNode.java:257)
    at org.elasticsearch.discovery.zen.ping.multicast.MulticastZenPing$Receiver.run(MulticastZenPing.java:410)
    at java.lang.Thread.run(Thread.java:662)
```

Is this expected?
Thanks,
Chen
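Two practical follow-ups to the reply above, as a minimal sketch: the 1.0 startup script can still be daemonized with `-d`, and if the multicast warnings come from unrelated machines on the network, multicast discovery can be switched off in favour of an explicit host list (the hostnames below are made up for illustration):

```
# Run in the background again (1.0 starts in the foreground; -d daemonizes):
./bin/elasticsearch -d

# elasticsearch.yml - ignore stray multicast packets from old or foreign
# nodes by disabling multicast and listing your real nodes explicitly:
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["node1.example.com", "node2.example.com"]
```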