Malformed JSON for an object with a raw field only, using XContentBuilder
Hi, I couldn't find a way to create a valid document using the XContentBuilder that includes only a raw field. This is the code I used:

String jsonStr = "{\"field\":\"value\"}";
XContentBuilder xb = jsonBuilder()
    .startObject()
    .rawField("object_name", jsonStr.getBytes())
    .endObject();

This code produced a malformed JSON document: { , "object_name" : {"field":"value"} }. Looking at the JsonXContentGenerator implementation of writeRawField, this seems to be exactly what the code is expected to do, in fact:

@Override
public void writeRawField(String fieldName, byte[] content, OutputStream bos) throws IOException {
    generator.writeRaw(", \"");
    generator.writeRaw(fieldName);
    generator.writeRaw("\" : ");
    flush();
    bos.write(content);
}

To work around this I had to add a dummy empty field to the document:

XContentBuilder xb = jsonBuilder()
    .startObject()
    .field("dummy", "")
    .rawField(DOC_OBJECT_NAME, jsonStr.getBytes())
    .endObject();

I wonder whether there is any other method that allows adding only a raw field to a document using XContentBuilder. Thank you in advance, Andrea -- You received this message because you are subscribed to the Google Groups "elasticsearch" group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/1eb49552-df19-4548-9696-534c1c3428b7%40googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
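A language-agnostic way to sidestep hand-written separators is to parse the raw fragment and let a serializer embed it, so commas and braces are emitted correctly by the writer itself. The Python sketch below is an illustration only (it is not the XContentBuilder API; the function name is made up) of the output the builder would ideally produce for a raw-field-only document:

```python
import json

def embed_raw_field(field_name, raw_json):
    """Return a JSON document with the raw fragment embedded under field_name.

    Parsing the fragment and re-serializing lets the JSON writer handle all
    separators, so no leading ", " can appear before the first field.
    """
    return json.dumps({field_name: json.loads(raw_json)})

doc = embed_raw_field("object_name", '{"field":"value"}')
# doc is well-formed: {"object_name": {"field": "value"}}
```

The cost of this approach is that the raw bytes are parsed rather than copied through verbatim, which is what rawField was presumably trying to avoid.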
Re: Need help retrieving field from ES
Hi Karol, Thanks for the reply. We were left this ES setup by a previous member of staff. We create new indexes every hour using the following Perl statement. Are you saying I have to add store => 'yes' to keepalive? We don't do that for the others, as you can see:

create_index(
    index => $index,
    settings => {
        _timestamp => { enabled => 1, store => 1 },
        number_of_shards => 3,
        number_of_replicas => 1,
    },
    mappings => {
        varnish => {
            _timestamp => { enabled => 1, store => 1 },
            properties => {
                content_length => { type => 'integer' },
                age => { type => 'integer' },
                keepalive => { type => 'integer' },
                resp_time => { type => 'float' },
                host => { type => 'string', index => 'not_analyzed' },
                time => { type => 'string', store => 'yes' },
                # SNIPPED
                location => { type => 'string', index => 'not_analyzed' },
                addr => {
                    fields => {
                        ip => { type => 'ip' },
                        addr => { type => 'string', index => 'not_analyzed' },
                    }
                },
            }
        }
    },
);

How would I know if the _source field has been disabled? How do I check? Does this help more:

{ "_index": "2013122312", "_type": "log", "_id": "Juh_YQJaT4GQ8Pjwk1bnqw", "_score": 1, "_source": { "protocol": "HTTP/1.0", "cdn": "-", "vary": "Accept-Encoding,ETag", "browser": "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.1) GeckaSeka/20090911 Firefox/3.5.1", "encoding": "-", "location": "-", "geo": "US", "ref": "-", "origin": "-", "cookie": "-", "uri": "/", "cache_control": "-", "content_length": 54053, "userid": 0, "age": 11556, "resp_time": 0.000110149, "method": "GET", "accept": "-", "ssl": 0, "response_code": 200, "accept_language": "-", "varnstat": "hit", "_src": "log", "addr": "41.5.97.6" } }

Do you need any more information to help? Thanks again, Nick

On Tue, Dec 24, 2013 at 2:27 AM, Karol Gwaj ka...@gwaj.me wrote: The keepalive field is stored in the _source field (if you want to store it separately you have to add store: true to the mapping). Hard to tell more based on your example; also, maybe you disabled the _source field completely? On Monday, December 23, 2013 8:40:17 PM UTC, Nick Toseland wrote: Hi All, I am new to Elasticsearch, please forgive my stupidity. I can't seem to get the keepalive field out of ES.
{ "_index": "lj-2013122320", "_type": "varnish", "_id": "Y1M18ZItTDaap_rOAS5YOA", "_score": 1.0 }

I can get other fields out of it, e.g. cdn:

{ "_index": "2013122320", "_type": "log", "_id": "2neLlVNKQCmXq6etTE6Kcw", "_score": 1.0, "fields": { "cdn": "-" } }

The mapping is there:

{"log":{"_timestamp":{"enabled":true,"store":true},"properties":{"keepalive":{"type":"integer"

Any help is much appreciated. Thanks in advance, Nick
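As a minimal illustration of Karol's point that a field which is not stored separately has to be read out of _source rather than requested via "fields", here is a Python sketch; the hit structure is modeled on the examples above, but the keepalive value and the query body are invented for the example (keepalive only appears in _source if it was in the indexed document in the first place):

```python
# A hypothetical search hit as returned with _source enabled.
hit = {
    "_index": "2013122312",
    "_type": "log",
    "_id": "Juh_YQJaT4GQ8Pjwk1bnqw",
    "_source": {"keepalive": 1, "resp_time": 0.000110149},
}

# With no "store" on the mapping, read the field from _source on each hit.
keepalive = hit["_source"]["keepalive"]

# Asking for a stored field explicitly would instead use a body like this
# (only useful once the mapping has store enabled for that field).
stored_fields_query = {"query": {"match_all": {}}, "fields": ["keepalive"]}
```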
Setting path.data stops the EC2 stuff
Platform: ES 0.90.7 on Ubuntu 12.04 on an AWS EC2 instance.

I now have the EC2 clustering working on its own. What I'm trying to do now is index on ephemeral storage (where I have space), save the index to S3, and restore it on EC2 instance startup. Note: I am only running a single node on a single EC2 instance. Note 2: I will try the EBS approach as well, but right now I have an S3 bucket set up for me and can experiment with it. I need someone with EC2 admin privileges to set up the EBS volume (...or so I think...?).

My problem now is that if I set path.data in /etc/elasticsearch/elasticsearch.yml to /mnt/elasticsearch (to get ephemeral storage, since my main problem here is running out of disk space on an EC2 instance that is started and stopped), then the EC2 discovery and S3 settings seem to be skipped. Here is the complete elasticsearch.yml file with keys XXX'd out: https://gist.github.com/steinarb/8094353

I.e., if I don't set path.data, then the file /var/log/elasticsearch/mysecretname-cluster.log is written to, and messages like these are output:

[2013-12-24 12:17:14,341][TRACE][discovery.ec2] [Foster, Bill] building dynamic unicast discovery nodes...
[2013-12-24 12:17:14,342][DEBUG][discovery.ec2] [Foster, Bill] using dynamic discovery nodes []
...
[2013-12-24 12:17:17,715][DEBUG][gateway.s3 ] [Foster, Bill] reading state from gateway org.elasticsearch.gateway.shared.SharedStorageGateway$1@2eb9f428
...

If I do set path.data, /var/log/elasticsearch/elasticsearch.log is written to instead, and no .ec2 or .s3 messages can be found in the log. (Once I get this working, one more problem will be to ensure that /mnt/elasticsearch is created and owned by the elasticsearch user before ES is started.)
Re: Aggregates vs facets vs nested documents?
Hi, Your data model assumes a 1-N relationship between transactions and worker events. There are several ways you can solve this kind of issue with Elasticsearch:
- denormalization,
- nested documents,
- parent/child relations.

Maybe the easiest way would be to store data directly in the expected format. On a start event, you would insert a new transaction into your index using the transaction id as the document id, and later, when the transaction ends, you could update the transaction to record the fact that it finished, along with its duration. With this option, data is indexed in a way that is easily searchable, and you'll be able to leverage all the power of aggregations and facets to compute things like the number of non-terminated transactions, the average duration, etc.

However, if you have a very high ingestion rate, this option might become a bit too slow, in which case you might want instead to record transactions as parent documents and events as child documents using parent/child relations. This will make indexing faster but requires more memory, and you'll have less power at query time. For example, finding transactions that started but didn't finish would require finding the start events, resolving their transaction ids, then finding the end events, resolving their transaction ids, and finally computing the difference of the two sets of transactions. This kind of query would typically execute much more slowly than if you already had all the information in a single transaction document (as with the previous option). Moreover, computing things like the average duration of transactions using facets or aggregations wouldn't be possible anymore with parent/child relations. See http://www.elasticsearch.org/blog/managing-relations-inside-elasticsearch/ for more information. -- Adrien Grand
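The client-side join described above (find the start events, find the end events, diff the two sets of transaction ids) can be sketched in a few lines; the event data here is invented purely for illustration:

```python
# Hypothetical worker events, each tagged with its transaction id.
events = [
    {"transaction_id": "t1", "type": "start"},
    {"transaction_id": "t1", "type": "end"},
    {"transaction_id": "t2", "type": "start"},
    {"transaction_id": "t3", "type": "start"},
    {"transaction_id": "t3", "type": "end"},
]

# Transactions that started but did not finish are the set difference of
# the ids seen on start events and the ids seen on end events.
started = {e["transaction_id"] for e in events if e["type"] == "start"}
ended = {e["transaction_id"] for e in events if e["type"] == "end"}
unfinished = started - ended
# unfinished == {"t2"}
```

This is exactly the work that disappears when each transaction is kept as a single document updated in place, which is why the denormalized option queries so much faster.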
Re: ElasticSearch and Oracle Java version 1.7 update 45
Hi, Indeed, Lucene suffers from bugs in Oracle's Java 1.7u40 and 1.7u45. However, 1.7u25 should be safe. You can find information about these bugs in Lucene's JIRA and mailing list. For example, here is one that affects Java 1.7u40: https://issues.apache.org/jira/browse/LUCENE-5212 -- Adrien Grand
Compared to Solr (with Solr Cloud), what is the advantage(s) of Elasticsearch?
I have never used Apache Solr before, and I'm trying Elasticsearch in my project. The documentation of ES is a little scarce, but I have to explain to my supervisor why I chose ES over Solr. As far as I know, Solr (with SolrCloud) also supports distributed indexing, near-real-time update and search, and automatic load balancing, which are the main features of Elasticsearch. What are the advantages of ES compared to Apache Solr? Could anybody give me a tip, or some information links? Thanks a lot.
Re: Compared to Solr (with Solr Cloud), what is the advantage(s) of Elasticsearch?
I would say: play with both for some hours. I really think you will get some answers by yourself! I don't want to say more than this as I probably have a biased opinion ;-) -- David Pilato | Technical Advocate | Elasticsearch.com @dadoonet | @elasticsearchfr
Re: Compared to Solr (with Solr Cloud), what is the advantage(s) of Elasticsearch?
About six months ago I spent a week porting a prototype from SolrCloud to Elasticsearch, with the intent of evaluating Elasticsearch and either throwing out the port or building on it. By the third day or so I was convinced I'd stick with Elasticsearch because:

1. I was impressed with http://www.elasticsearch.org/contributing-to-elasticsearch/.
2. The documentation is better.
3. I liked the query DSL better than Solr's.
4. There is some HTTP GET you can hit in Solr that will delete the index (or a shard or something). That shook my faith in humanity a little, especially when I pasted it into IRC and my coworker clicked it or moused over it or something. GETs. Idempotent.
5. I liked the phrase suggester.
6. My ops team seemed to like it better.
7. There was (and still is) a deb package.
8. I liked the way Elasticsearch was tested. I admit I haven't actually looked into how Solr is tested.

Since then:

1. I've enjoyed the process of landing changes in Elasticsearch much more than in Lucene. I assume Solr would be the same, because it is in the same repository as Lucene. The GitHub process (pull requests, etc.) is better than JIRA/svn/patch files. I also think the Elasticsearch committers/repository collaborators are easier to work with than the Lucene folks.
2. The phrase suggester needed some work to be as good as our (surprisingly advanced) home-grown suggester. It is now that good.
3. Elasticsearch has really improved the process of maintaining their documentation, so I imagine it'll only get better.
4. It seems to be working. We're using 0.90.7 at this point (see https://en.wikisource.org/wiki/Special:Version) to power the search on a couple hundred wikis without any trouble. Try it: https://en.wikisource.org/w/index.php?search=alias&title=Special%3ASearch

Nik
Re: Possible to make ES Node Name same as Hostname?
Hello Jörg, I was not aware of this page, but after skimming it, it is not what I need in this situation. The majority of the commands on the page are invoked as arguments to systemctl, which is the command-line tool to inspect, modify and manage systemd on a running system. I'm not sure yet why some commands are included that AFAIK are more appropriately invoked within a Unit config file, but maybe those commands can be invoked in both situations. Since these systemd environment commands are run from within systemctl, they're basically ways to modify a running environment interactively; they're not the way to set up and modify the environment on bootup. In systemd, environment variables can be set up during boot, primarily in the .target Unit files, but because systemd is fully compatible with other Linux subsystems, almost all the traditional ways to set up, modify and run things are supported. In openSUSE's case, it's migrating from the well-known SysVinit subsystem, so until the legacy init and bash ways of invoking code are replaced, they are still a perfectly legitimate way of doing things. That is why, in my previous post, I described how I traditionally used the bash-script way of creating an environment variable (/etc/profile.local) and then tested to make sure it works. So that doesn't appear to be the problem. Instead, I believe the problem is specific to the Elasticsearch code, given that the following very specific error was returned:

elasticsearch.service: main process exited, code=exited, status=3/NOTIMPLEMENTED

It could have been any error, but one so specific? Tony

On Monday, December 23, 2013 2:54:35 PM UTC-8, Jörg Prante wrote: Note, if you are using systemd, you must set environment vars with systemctl: http://www.freedesktop.org/software/systemd/man/systemctl.html Not sure what the -Des commands are. If you mean the elasticsearch command line, many ES config variables can be prefixed with es.; the -D flag is Java's.
Jörg
Re: How can I merge the results of two aggregations?
You can use scripts to run aggregations on the union of two fields. Here is an example:

GET /test/_search
{
  "aggregations": {
    "name": {
      "terms": {
        "script": "doc['firstname'].values + doc['lastname'].values"
      }
    }
  }
}

An alternative would be to index both first names and last names into a single 'name' field and to run the aggregation on this field at search time. Although this would require more disk space, it would also be faster.

On Fri, Dec 20, 2013 at 10:53 AM, Tim S timsti...@gmail.com wrote: Sorry, maybe I'm missing something. AFAICS the OR filter would operate on filters or queries, not on aggregations. Can you give me an example of how I'd use this to merge the results of the two aggregations? Thanks.

On Friday, December 20, 2013 2:54:14 AM UTC, kidkid wrote: Hi, sorry, that was my mistake. Could you take a look at the OR filter: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-or-filter.html In your case I think you could use a match_all query, use the OR filter, and let ES merge the result.

On Thursday, December 19, 2013 2:29:06 AM UTC-8, Tim S wrote: How can I use a bool query on the result of an aggregation? I want both aggregations to independently facet on the whole index, then merge the results. I can see how I would use a bool query to limit the set of docs I'm aggregating on, but I can't see how I would use it to merge the results of two aggregations. Thanks, Tim.

On Wednesday, December 18, 2013 4:40:37 PM UTC, kidkid wrote: You could use a bool query and let Elasticsearch do the rest. http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html

On Wednesday, December 18, 2013 3:58:07 PM UTC+7, Tim S wrote: In the example below, I ask Elasticsearch for two aggregations (I've simplified it; it actually has some nested aggregations in there):

{
  "aggregations": {
    "agg1": { "terms": { "field": "forname" } },
    "agg2": { "terms": { "field": "surname" } }
  }
}

What I get back is two sets of results, i.e.

{
  "aggregations": {
    "agg1": { "buckets": [ { "key": "john", "doc_count": 1 }, { "key": "bob", "doc_count": 4 } ] },
    "agg2": { "buckets": [ { "key": "smith", "doc_count": 3 }, { "key": "jones", "doc_count": 2 } ] }
  }
}

What I'd like to get back is one set of results, i.e. a list of terms appearing in either of the fields, with the counts summed across both, e.g.

{
  "aggregations": {
    "agg1 OR agg2": { "buckets": [ { "key": "john", "doc_count": 1 }, { "key": "bob", "doc_count": 4 }, { "key": "smith", "doc_count": 3 }, { "key": "jones", "doc_count": 2 } ] }
  }
}

Is there any way of doing this? I could request them and merge them in my own code, but if there's a built-in way then I'd rather use that. Thanks, Tim. -- Adrien Grand
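If you do end up merging the two aggregations in your own code, the summing Tim describes is straightforward. A Python sketch (the function name and sample buckets are invented for illustration):

```python
from collections import Counter

def merge_buckets(*bucket_lists):
    """Sum doc_count per key across several terms-aggregation bucket lists."""
    counts = Counter()
    for buckets in bucket_lists:
        for bucket in buckets:
            counts[bucket["key"]] += bucket["doc_count"]
    # Return in the usual terms-aggregation shape, highest count first.
    return [{"key": k, "doc_count": c} for k, c in counts.most_common()]

agg1 = [{"key": "john", "doc_count": 1}, {"key": "bob", "doc_count": 4}]
agg2 = [{"key": "smith", "doc_count": 3}, {"key": "jones", "doc_count": 2}]
merged = merge_buckets(agg1, agg2)
```

Note this sums counts when the same term appears in both fields of one document, which matches the "counts summed across both" behavior asked for above but differs from what a single combined field would report for that document.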
Completion suggester with separated doc types (multiple indices?)
Hello, I've started playing around with the completion suggester to implement autocomplete functionality in my application and found it pretty straightforward and simple to use. The problem is that I don't want mixed types in my suggestions: if I have a music index with song and artist types, I want separate song and artist autocompletes. From the documentation I gather that this doesn't seem to be supported, so I'm considering having a separate index for each type. I've read at http://elasticsearch-users.115913.n3.nabble.com/More-indices-vs-more-types-td3999423.html#a4002051 that this is not the best practice, but given the size of my data I presume I won't have problems: I have around 10 types, most of them with around 2000 documents. It's worth noting that I'm just using Elasticsearch for this autocomplete functionality (although I may use it for regular search eventually). So I wanted to check whether it makes sense to model my indices that way, or whether there's an alternative solution to my problem (for example, using the suggest plugin https://github.com/spinscale/elasticsearch-suggest-plugin?) Thanks, Facundo.
Re: Possible to make ES Node Name same as Hostname?
OK, I ran each of your tests, but because I'm invoking elasticsearch as a service and not from the CLI, I interactively ran the command you wanted prepended, then started the elasticsearch service, then reloaded elasticsearch-head pointing at the elasticsearch node. So, as follows, I first disabled the service so it doesn't start automatically. After each block below I ran elasticsearch-head with the same result: a random friendly node name was created in elasticsearch-head.

# echo $NAME
/bin/hostname
# $NAME
ELASTICSEAR-1

# systemctl stop elasticsearch.service
# export NAME=`hostname`
# systemctl start elasticsearch.service

# systemctl stop elasticsearch.service
# export NAME=$HOSTNAME
# systemctl start elasticsearch.service

# systemctl stop elasticsearch.service
# export NAME=$hostname
# systemctl start elasticsearch.service

For your reference, here are the contents of the elasticsearch.service Unit file (aka service configuration file). Although it references many exterior files (via the -Des options), I doubt anything in them is likely to be relevant in this case, because we seem to be dealing with a variable that (if supported) has a null value.
[Unit]
Description=Starts and stops a single elasticsearch instance on this system
Documentation=http://www.elasticsearch.org

[Service]
Type=forking
EnvironmentFile=/etc/sysconfig/elasticsearch
User=elasticsearch
Group=elasticsearch
PIDFile=/var/run/elasticsearch/elasticsearch.pid
ExecStart=/usr/share/elasticsearch/bin/elasticsearch -p /var/run/elasticsearch/elasticsearch.pid -Des.default.config=$CONF_FILE -Des.default.path.home=$ES_HOME -Des.default.path.logs=$LOG_DIR -Des.default.path.data=$DATA_DIR -Des.default.path.work=$WORK_DIR -Des.default.path.conf=$CONF_DIR
# See MAX_OPEN_FILES
LimitNOFILE=65535
# See MAX_LOCKED_MEMORY, use infinity when MAX_LOCKED_MEMORY=unlimited and using bootstrap.mlockall: true
#LimitMEMLOCK=infinity

[Install]
WantedBy=multi-user.target

A thought: if you're running the same version as me, I wonder if there might be a difference between the RPM build (which I am using) and yours, which of course is a DEB build unless you built from source. I'm using the RPM downloaded directly from the Elasticsearch website.
Tony

On Monday, December 23, 2013 6:16:45 PM UTC-8, Karol Gwaj wrote: I'm using elasticsearch version 0.90.7 (which is the same as the one you mentioned in your question). Also, I'm running my elasticsearch cluster on Ubuntu, so my example was more suited to that Linux distribution. Coming back to your problem, to diagnose it you can:
- add *echo $NAME* at the beginning of the bin/elasticsearch script (if nothing is printed then your environment variable is not declared correctly)
- add *export NAME=`hostname`* at the beginning of the bin/elasticsearch script
- add *export NAME=$HOSTNAME* at the beginning of the bin/elasticsearch script
Can you try the steps above first (one at a time), before defining your environment variable in /etc/profile.local? Cheers,

On Monday, December 23, 2013 9:52:40 PM UTC, Tony Su wrote: After considering this post, I successfully created an environment variable by adding to the bash profile file (actually, on the openSUSE system I'm running, I created a file /etc/profile.local which contains system customizations; the original /etc/profile should not be edited). BTW, on a non-Windows box, hostname must be in lower case, not upper case.

/etc/profile.local:
export NAME=/bin/hostname

After running source /etc/profile.local to activate the contents of the file, I can successfully test the new variable; it does return the machine's hostname:

$NAME

But when I modify the elasticsearch.yml file exactly as described:

node.name: ${NAME}

the result is that the elasticsearch service fails to start with the following error:

ELASTICSEAR-1 systemd[1]: Starting Starts and stops a single elasticsearch instance on this system...
Dec 23 13:23:18 ELASTICSEAR-1 systemd[1]: PID file /var/run/elasticsearch/elasticsearch.pid not readable (yet?) after start.
Dec 23 13:23:43 ELASTICSEAR-1 systemd[1]: Started Starts and stops a single elasticsearch instance on this system.
Dec 23 13:23:43 ELASTICSEAR-1 systemd[1]: elasticsearch.service: main process exited, code=exited, status=3/NOTIMPLEMENTED
Dec 23 13:23:43 ELASTICSEAR-1 systemd[1]: Unit elasticsearch.service entered failed state.

I also tried without the curly braces, but then the string is read literally and not as a variable. Commenting out the attempt to set the node name to the hostname allows the elasticsearch service to start again. From the above error (not implemented), is it possible that the current stable elasticsearch release does not support your recommendation and I need to maybe install an unstable version? Thx, Tony

On Sunday, December 22, 2013 4:53:48 PM UTC-8, Karol Gwaj
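For background on what node.name: ${NAME} is doing, placeholders of that form are resolved from the environment visible to the Elasticsearch process itself, not from the shell you started systemctl in. The Python sketch below is only an illustration of that kind of substitution (not Elasticsearch's actual resolver): if the variable is not present in the process environment, resolution fails instead of falling back to a value.

```python
import re

def resolve_placeholders(line, env):
    """Substitute ${VAR} placeholders in a config line from the given env dict."""
    def substitute(match):
        name = match.group(1)
        if name not in env:
            # An unresolvable placeholder is an error, not an empty string.
            raise KeyError("unresolved placeholder: " + name)
        return env[name]
    return re.sub(r"\$\{(\w+)\}", substitute, line)

resolved = resolve_placeholders("node.name: ${NAME}", {"NAME": "ELASTICSEAR-1"})
# resolved == "node.name: ELASTICSEAR-1"
```

This is consistent with the symptom above: exporting NAME in an interactive shell does not put it into the environment systemd builds for the service, so the placeholder cannot resolve when the service starts.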
Reports and Notifications.
We have a HUGE Splunk install and are constantly running into our limits. We have decided to go with a tiered solution using Kibana + Logstash + Elasticsearch. The one thing that we really need is a way to have reports generated like we can with Splunk. Does anyone know if there is a plugin or third-party app to do this sort of thing? Thanks, CP
Re: Possible to make ES Node Name same as Hostname?
yep, my setup is a little bit different from yours i actually downloaded elasticsearch as tar.gz file probably the problem you are experiencing is related more to the fact that systemctl is not passing environment variables to executed script i had similar problem with upstart (which is equivalent of systemctl on ubuntu) from what i know about systemctl, the way to define environment variables that will be passed to executed script is through EnvironmentFile so maybe try to define NAME environment variable in */etc/sysconfig/elasticsearch* (EnvironmentFile ) from your service unit file i see that elasticsearch startup script is located in: */usr/share/elasticsearch/bin/elasticsearch* if you add definition of your environment variable at the beginning of this file then you will not have to worry about systemctl not passing environment variables to your script so try something like that first: export NAME=$HOSTNAME echo $NAME # this should print your hostname /usr/share/elasticsearch/bin/elasticsearch On Tuesday, December 24, 2013 4:24:02 PM UTC, Tony Su wrote: OK, I ran each of your tests, but because I'm invoking elasticsearch as a service and not from the CLI, I interactively ran the command you wanted prepended, then started the elasticsearch service, then reloaded elasticsearch-head pointing to the elasticsearch node. So, as follows, I first disabled the service so it doesn't start automatically. 
After each block below I ran elasticsearch-head with the same results: a random friendly node name was created in elasticsearch-head.

# echo $NAME
/bin/hostname
# $NAME
ELASTICSEAR-1

# systemctl stop elasticsearch.service
# export NAME=hostname
# systemctl start elasticsearch.service

# systemctl stop elasticsearch.service
# export NAME=$HOSTNAME
# systemctl start elasticsearch.service

# systemctl stop elasticsearch.service
# export NAME=$hostname
# systemctl start elasticsearch.service

For your reference, here are the contents of the elasticsearch.service unit file (aka service configuration file). Although it references many exterior files (the -Des commands), I doubt anything in them is likely to be relevant here, because we seem to be dealing with a variable whose value (if it is supported at all) is null.

[Unit]
Description=Starts and stops a single elasticsearch instance on this system
Documentation=http://www.elasticsearch.org

[Service]
Type=forking
EnvironmentFile=/etc/sysconfig/elasticsearch
User=elasticsearch
Group=elasticsearch
PIDFile=/var/run/elasticsearch/elasticsearch.pid
ExecStart=/usr/share/elasticsearch/bin/elasticsearch -p /var/run/elasticsearch/elasticsearch.pid -Des.default.config=$CONF_FILE -Des.default.path.home=$ES_HOME -Des.default.path.logs=$LOG_DIR -Des.default.path.data=$DATA_DIR -Des.default.path.work=$WORK_DIR -Des.default.path.conf=$CONF_DIR
# See MAX_OPEN_FILES
LimitNOFILE=65535
# See MAX_LOCKED_MEMORY, use infinity when MAX_LOCKED_MEMORY=unlimited and using bootstrap.mlockall: true
#LimitMEMLOCK=infinity

[Install]
WantedBy=multi-user.target

A thought: if you're running the same version as me, I wonder if there might be a difference between the RPM build (which I am using) and yours, which of course is a DEB build unless you built from source. I'm using the RPM downloaded directly from the Elasticsearch website.
Tony

On Monday, December 23, 2013 6:16:45 PM UTC-8, Karol Gwaj wrote: I'm using elasticsearch version 0.90.7 (which is the same as the one you mentioned in your question). Also, I'm running my elasticsearch cluster on ubuntu, so my example was more suited to that linux distribution. Coming back to your problem, to diagnose it you can:

- add *echo $NAME* at the beginning of the bin/elasticsearch script (if nothing is printed, then your environment variable is not declared correctly)
- add *export NAME=`hostname`* at the beginning of the bin/elasticsearch script
- add *export NAME=$HOSTNAME* at the beginning of the bin/elasticsearch script

Can you try the steps above first (one at a time), before defining your environment variable in /etc/profile.local?

Cheers,

On Monday, December 23, 2013 9:52:40 PM UTC, Tony Su wrote: After considering this post, I successfully created an environment variable by adding it to the bash profile file (actually, on the openSUSE I'm running I created a file /etc/profile.local which contains system customizations; the original /etc/profile should not be edited). BTW - on a non-Windows box, hostname must be in lower case, not upper case.

/etc/profile.local
export NAME=/bin/hostname

After running source /etc/profile.local to activate the contents of the file, I can successfully test the new variable; it does return the machine's hostname.

$NAME

But, when I modify the elasticsearch.yml file exactly as
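The diagnostic steps being discussed can be summed up in a short sketch (assuming a bash shell; the sysconfig path is the EnvironmentFile referenced by the unit file above):

```shell
# Sketch of the diagnostics discussed in this thread (assumes bash).
# Note the difference between the three forms that appear above:
#   export NAME=hostname      -> sets the literal string "hostname"
#   export NAME=$(hostname)   -> output of the hostname command
#   export NAME=$HOSTNAME     -> bash's HOSTNAME variable
export NAME=$(hostname)
echo "$NAME"   # should print the machine's hostname

# For the systemd service, the expanded value could instead be written
# into the EnvironmentFile the unit already references (run as root):
#   echo "NAME=$(hostname)" >> /etc/sysconfig/elasticsearch
#   systemctl restart elasticsearch.service
```

The `$(hostname)` expansion happens at write time, which matters because EnvironmentFile entries are read literally by systemd, with no shell expansion of their own.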
Re: Completion suggester with separated doc types (multiple indices?)
Hey, the way the completion suggester is implemented, it does not support filtering by types (neither does the suggest plugin) - so this makes a pretty clear decision process for your use-case. The reason for this is the different approach to how suggest data is stored and queried - in a nutshell, the suggest data structure simply takes the whole index data and uses it for suggestions. The type itself is, simply put, just more metadata, which cannot be filtered out. I'd go with the completion suggester if possible (as I wrote the suggest plugin, I can tell you that the completion suggester has a way better design and, obviously, is part of the core). Hope this helps...

--Alex

On Tue, Dec 24, 2013 at 4:50 PM, Facundo Olano facundo.ol...@gmail.com wrote: Hello, I've started playing around with the completion suggester to implement autocomplete functionality in my application and found it pretty straightforward and simple to use. The problem I have is that I don't want mixed types in my suggestions: if I have a music index with song and artist types, I want separate song and artist autocompletes. From the documentation I get that this doesn't seem to be supported, so I'm considering having a separate index for each type. I've read http://elasticsearch-users.115913.n3.nabble.com/More-indices-vs-more-types-td3999423.html#a4002051 that this is not the best practice, but from the size of my data I presume I won't be having problems: I have around 10 types, most of them with around 2000 documents. It's worth noting that I'm just using elasticsearch for this autocomplete functionality (although I may use it for regular search eventually). So I wanted to check if it makes sense to model my indices that way, or if there's an alternative solution to my problem (for example, using the suggest plugin https://github.com/spinscale/elasticsearch-suggest-plugin ?)

Thanks, Facundo.
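If you do go with one index per type, the setup might look like the following sketch. The index and field names (`song_index`, `name_suggest`) are invented for illustration, and the mapping uses the completion-suggester syntax of the 0.90.x era discussed here; the curl calls are left commented out since they assume a running cluster on localhost:9200.

```shell
# Hypothetical per-type index with a completion field (names invented).
# The mapping is built and printed here; uncomment the curl lines to
# actually send it to a cluster.
MAPPING='{
  "mappings": {
    "song": {
      "properties": {
        "name_suggest": { "type": "completion" }
      }
    }
  }
}'
echo "$MAPPING"

# curl -XPUT 'localhost:9200/song_index' -d "$MAPPING"
# Repeat per type (artist_index, ...), then suggest against one index only:
# curl -XPOST 'localhost:9200/song_index/_suggest' -d '{
#   "my-suggest": {
#     "text": "beat",
#     "completion": { "field": "name_suggest" }
#   }
# }'
```

Because each suggest request targets a single index, this sidesteps the "types are just metadata" limitation without any filtering at query time.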
Re: Nodes are not able to connect to the master
Alex - so if I were to look for the right suggester that works on a part of the index (filtered), and not the entire index, is an edge n-gram the best choice? -Amit.

On Tue, Dec 24, 2013 at 4:21 PM, Alexander Reelsen a...@spinscale.de wrote: Hey, connection timed out means that the other host is not reachable. This can have dozens of reasons. If the node never joins the cluster, you might have a firewall problem. If the node had already joined the cluster, you might have a temporary network outage, or maybe your node is under extremely high load. Without proper monitoring and more digging through the log files this is really hard to tell. A first thing to try is whether you are able to reach that port manually on that host - completely independent of elasticsearch itself. If that does not work, you have other problems.

--Alex

On Tue, Dec 24, 2013 at 8:33 AM, deep saxena sandy100s...@gmail.com wrote:

org.elasticsearch.transport.ConnectTransportException: [Crooked Man][inet[/192.168.202.1:9300]] connect_timeout[30s]
    at org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:671)
    at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:610)
    at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:580)
    at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:127)
    at org.elasticsearch.cluster.service.InternalClusterService$2.run(InternalClusterService.java:300)
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:95)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.ConnectException: Connection timed out: no further information: /192.168.202.1:9300
    at
sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
    at org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:150)
    at org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
    at org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
    at org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
    at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)

The .yml file contains the default settings, no changes in that, and it is still not connecting. Is there any issue with the network settings or the elasticsearch configuration?
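Alex's "reach that port manually" suggestion can be sketched like this, with the host and port taken from the stack trace above (this uses bash's /dev/tcp redirection, so it assumes bash and the coreutils `timeout`; `telnet` or `nc -z` would do the same job):

```shell
# A first check, completely independent of elasticsearch: can we even
# open a TCP connection to the transport port named in the exception?
HOST=192.168.202.1   # host from the stack trace
PORT=9300            # default transport port
if timeout 3 bash -c "exec 3<>/dev/tcp/$HOST/$PORT" 2>/dev/null; then
  RESULT="reachable"
else
  RESULT="not reachable"   # firewall, routing problem, or nothing listening
fi
echo "$HOST:$PORT is $RESULT"
```

A "connection refused" fails immediately (something answered, but no listener), while a silent hang until the timeout usually points at a firewall dropping packets, which matches the 30s connect_timeout in the log.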
Re: Reports and Notifications.
In splunk there is the ability to create a report or notification if a certain threshold/event/trigger is captured; then a script or report is triggered/sent. We are looking for similar functionality. I will check out what you sent.

On Dec 24, 2013 7:31 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi, Could you please describe what you mean by reports? Are you looking for a daily/weekly email with graphs, or something else? We have that in SPM (monitoring), and Logsene (log analytics) is getting it, too. Kibana has this as well via phantomjs, I believe, though I'm not sure how/if it's hooked up to email.

Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/

On Tuesday, December 24, 2013 2:53:22 PM UTC-5, CP wrote: We have a HUGE splunk install and are constantly running into our limits. We have decided to go with a tiered solution using kibana+logstash+elasticsearch. The one thing that we really need is a way to have reports generated like we can with splunk. Does anyone know if there is a plugin or third-party app that does this sort of thing? Thanks, CP
Re: Several questions on ES in production environment
Hi,

On Tuesday, December 24, 2013 8:47:54 AM UTC-5, Han JU wrote: Hi, We're approaching the first release of our product and we use ElasticSearch as a key component in our system. But there are still some questions and doubts, so I'd like to hear from the more experienced users and ElasticSearch folks here.

1. We use ElasticSearch as a search tool but also as the storage of all documents. It means that the front-end retrieves fields from ES just as if it were a database. We've already disabled indexing (index: no) on the fields that don't need to be searched (lists of ids etc.), but is this a good usage of ElasticSearch, given that we expect ~1 billion documents (~1.4KB each) in a single index in our first 3 months?

1.4KB is pretty small, so that's fine. Often keeping it all in ES is simpler - it doesn't require another hop to another server (e.g. a DB) to retrieve display data, and there is one moving piece fewer, which makes everything simpler. I'd keep your display data in ES and worry about changing it later IFF you have issues.

2. We will use thrift to push documents in production because we've seen a performance gain. Is there any downside to using thrift over plain json?

3. Some of our queries use a regexp filter. In my understanding this needs to load the target field of every document to see if it matches, so it's pretty costly for an index of 1 billion docs?

Yes, regexps are not the fastest. What are you trying to do that requires a regexp filter?

Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/
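The two mapping choices touched on in this thread can be sketched together. The field names below are invented for illustration; the mapping is only printed, with the curl call left commented out since it assumes a cluster on localhost:9200.

```shell
# Sketch of the mapping ideas discussed above (field names invented).
# "index": "no" skips indexing a display-only field entirely, and a
# not_analyzed field queried with a prefix query is often a cheaper
# substitute for a regexp filter when the pattern is anchored at the start.
MAPPING='{
  "doc": {
    "properties": {
      "related_ids":  { "type": "string", "index": "no" },
      "product_code": { "type": "string", "index": "not_analyzed" }
    }
  }
}'
echo "$MAPPING"

# curl -XPUT 'localhost:9200/myindex/doc/_mapping' -d "$MAPPING"
# Then, instead of a regexp filter on product_code, something like:
# { "query": { "prefix": { "product_code": "AB-12" } } }
```

A prefix query walks the sorted term dictionary directly instead of testing every term against a pattern, which is why it scales much better than a regexp on a billion-document index.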