Re: Searching indexed fields without analysing
Hi, a bit more information. I tried adding a custom analyzer based on a recommendation I saw online somewhere. This partly works, in that it's no longer tokenising, but I can't do wildcard searches in Kibana on the fields, and they're now case sensitive :(

curl localhost:9200/_template/logstash-username -XPUT -d '{
  "template": "logstash-*",
  "settings": {
    "analysis": {
      "analyzer": {
        "lc_analyzer": {
          "type": "custom",
          "tokenizer": "keyword",
          "filters": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "_default_": {
      "properties": {
        "User_Name": { "type": "string", "analyzer": "lc_analyzer" }
      }
    }
  }
}'

Thanks

On Wednesday, January 8, 2014 3:26:03 PM UTC, Chris H wrote:

Hi. I've deployed elasticsearch with logstash and kibana to take in Windows logs from my OSSEC log server, following this guide: http://vichargrave.com/ossec-log-management-with-elasticsearch/

I've tweaked the logstash config to extract some specific fields from the logs, such as User_Name. I'm having some issues searching on these fields though. These searches work as expected:

- User_Name: *
- User_Name: john.smith
- User_Name: john.*
- NOT User_Name: john.*

But I'm having problems with computer accounts, which take the format w-dc-01$ - they're being split on the - and the $ is ignored. So a search for w-dc-01 returns all the servers named w-anything. Also, I can't do NOT User_Name: *$ to exclude computer accounts.

The mappings are created automatically by logstash, and GET /logstash-2014.01.08/_mapping shows:

"User_Name": {
  "type": "multi_field",
  "fields": {
    "User_Name": { "type": "string", "omit_norms": true },
    "raw": {
      "type": "string",
      "index": "not_analyzed",
      "omit_norms": true,
      "index_options": "docs",
      "include_in_all": false,
      "ignore_above": 256
    }
  }
},

My (limited) understanding is that not_analyzed should stop the field being split, so that my search matches the full name, but it doesn't. I'm trying both kibana and curl to get results. Hope this makes sense.
I really like the look of elasticsearch, but being able to search on extracted fields like this is pretty key to me using it. Thanks. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/96e74e53-54f9-48ec-9e5c-8f1354b264be%40googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
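The splitting described above can be checked directly with the _analyze API (a diagnostic sketch; the standard analyzer is what an analysed string field uses by default):

```shell
# Show how the standard analyzer tokenises a computer account name.
# The hyphens split the term and the "$" is dropped, which is why a
# search for w-dc-01 on the analysed field matches every w-* server.
curl -XGET 'localhost:9200/_analyze?analyzer=standard&pretty' -d 'w-dc-01$'
```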
Re: logstash vs rivers for reading data from SQL Server
Hey, maybe you should ask your developers why they recommended logstash for this - I can't follow here (perhaps there is some export functionality in your SQL Server which a logstash input can use?). I would be interested in the reasons in this case.

--Alex

On Wed, Jan 8, 2014 at 5:26 PM, jsp jayasunde...@gmail.com wrote:

Hi, I am looking at implementing ES to index query data that I get from my SQL Server databases/tables. I was initially using a river to read data from SQL Server tables, but one of the developers in my team recommended looking at logstash. Can anyone comment on the benefits of using one over the other? I have not been able to find any documentation on reading data from SQL Server using logstash (http://logstash.net/docs/1.3.2/). Can someone point me to a guide on how to get started with logstash and SQL Server? Thanks, J
Re: Elasticsearch Missing Data
Hey, a couple of things:

1. Did you check the log files? Most likely in /var/log/elasticsearch if you use the packages. Is there anything suspicious at the time of your outage? Please check your master node as well, if you have one (I can't tell from the cluster health whether it is a master or client node).
2. Why should elasticsearch pull your data? Is there any special configuration you didn't mention? Or what exactly do you mean here?
3. Happy to help debug your issue with the init script. The elasticsearch.yml file should be in /etc/elasticsearch/ and not in /etc - was anything moved around manually? Can you still reproduce it?

--Alex

On Wed, Jan 8, 2014 at 8:10 PM, Eric Luellen eric.luel...@gmail.com wrote:

Hello, I've had my elasticsearch instance running for about a week with no issues, but last night it stopped working. When I went to look in Kibana, it stops logging around 20:45 on 1/7/14. I then restarted the service on both elasticsearch servers and it started logging again, and pulled back some logs from 07:10 that morning, even though I restarted the service around 10:00. So my questions are:

1. Why did it stop working? I don't see any obvious errors.
2. When I restarted it, why didn't it go back and pull all of the data instead of just some of it?

I see that there are no unassigned shards:

curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
{
  "cluster_name" : "my-elasticsearch",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 40,
  "active_shards" : 80,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}

Are there any additional queries or logs I can look at to see what is going on? On a slight side note, when I restarted my 2nd elasticsearch server it isn't reading from the /etc/elasticsearch.yml file like it should. It isn't creating the node name correctly or putting the data files in the spot I have configured.
I'm using CentOS and doing everything via /etc/init.d/elasticsearch on both servers; the elasticsearch1 server reads everything correctly but elasticsearch2 does not. Thanks for your help. Eric
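One way to see which configuration each node actually loaded (a diagnostic sketch using the same _nodes endpoint David suggests elsewhere in this digest) is to compare the settings the cluster reports per node:

```shell
# List the settings every node booted with (node.name, path.data, etc.).
# If elasticsearch2 ignored its config file, its reported settings will
# show defaults instead of the values from elasticsearch.yml.
curl -XGET 'http://localhost:9200/_nodes?all=true&pretty=true'
```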
Re: Kibana Static Dashboard ?
Then you can put it in $KIBANA_ROOT/app/dashboards and load it from there.

Regards, Mark Walkom (Infrastructure Engineer, Campaign Monitor) email: ma...@campaignmonitor.com web: www.campaignmonitor.com

On 9 January 2014 19:32, vineeth mohan vm.vineethmo...@gmail.com wrote:

Hello Jay, some advice here. You can save a Kibana dashboard as a static file too. Follow these steps: Save - Advanced - Export as schema. Thanks, Vineeth

On Thu, Jan 9, 2014 at 3:16 AM, Jay Wilson jawro...@gmail.com wrote:

As I understand Kibana, when a dashboard is saved it is placed into elasticsearch. I don't want it in elasticsearch; I want it in a static file.

On Wednesday, January 8, 2014 2:32:50 PM UTC-7, vineeth mohan wrote:

Hello Jay, can't you do the same from the kibana side by adding a must_not filter? Once you save that dashboard, you can always go back to the same link to see the same static dashboard. Thanks, Vineeth

On Thu, Jan 9, 2014 at 2:42 AM, Jay Wilson jawr...@gmail.com wrote:

I am modifying the guided.json dashboard. Down in the Events panel I would like to tell kibana to statically filter out specific records. I tried adding this to the file:

"query": {
  "filtered": {
    "query": {
      "bool": {
        "should": [
          { "query_string": { "query": "record-type: traffic-stats" } }
        ]
      }
    }
  }
},

It doesn't appear to work.
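Following Mark's suggestion, Kibana 3 can serve a dashboard from a flat file rather than from elasticsearch (a sketch; the filename mydashboard.json and the install path are illustrative):

```shell
# Copy the exported schema into Kibana's dashboards directory...
cp mydashboard.json $KIBANA_ROOT/app/dashboards/
# ...then open it with the file loader instead of the elasticsearch loader:
#   http://<kibana-host>/index.html#/dashboard/file/mydashboard.json
```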
Re: allow_explicit_index and _bulk
Hey, after having a very quick look, it looks like a bug (or wrong documentation - I need to check further). Can you create a github issue? Thanks!

--Alex

On Wed, Jan 8, 2014 at 11:08 PM, Gabe Gorelick-Feldman gabegorel...@gmail.com wrote:

The documentation on URL-based access control (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/url-access-control.html) implies that _bulk still works if you set rest.action.multi.allow_explicit_index: false, as long as you specify the index in the URL. However, I can't get it to work.

POST /foo/bar/_bulk
{ "index": {} }
{ "_id": 1234, "baz": "foobar" }

returns "explicit index in bulk is not allowed". Should this work?
How to configure and implement Synonyms with multi words.
Hi, I have the following synonyms that I want to configure:

software engineer = software engineer, se
senior software engineer = senior software engineer, sse
team lead = team lead, lead, tl

So if I search for se or Software Engineer, it should return the records having software engineer. What mapping should I apply on the Designation field, and what query should I fire to get the result? Is it possible to use a multi_match query? The following are the queries to create the records:

curl -XPUT 'http://localhost:9200/employee/test/1?pretty' -d '{"designation": "software engineer"}'
curl -XPUT 'http://localhost:9200/employee/test/2?pretty' -d '{"designation": "software engineer"}'
curl -XPUT 'http://localhost:9200/employee/test/3?pretty' -d '{"designation": "senior software engineer"}'
curl -XPUT 'http://localhost:9200/employee/test/4?pretty' -d '{"designation": "senior software engineer"}'
curl -XPUT 'http://localhost:9200/employee/test/5?pretty' -d '{"designation": "team lead"}'
curl -XPUT 'http://localhost:9200/employee/test/6?pretty&refresh=true' -d '{"designation": "team lead"}'
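One common approach (an untested sketch for the 0.90-era API; the analyzer and filter names are illustrative, not from the original post) is to define a synonym token filter in the index settings and analyse the designation field with it, so that a plain match query on "se" also hits "software engineer":

```shell
# Create the index with a synonym filter and apply it to "designation".
# Equivalent synonyms on one line expand into each other at analysis time.
curl -XPUT 'http://localhost:9200/employee' -d '{
  "settings": {
    "analysis": {
      "filter": {
        "designation_synonyms": {
          "type": "synonym",
          "synonyms": [
            "software engineer, se",
            "senior software engineer, sse",
            "team lead, lead, tl"
          ]
        }
      },
      "analyzer": {
        "synonym_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "designation_synonyms"]
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "designation": { "type": "string", "analyzer": "synonym_analyzer" }
      }
    }
  }
}'
```

With this in place a simple match query on designation should be enough; whether expansion happens at index or search time depends on where the analyzer is applied in the mapping.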
Re: Spring elastic search and configuration for Mappings and _settings files
Your configuration looks good to me. I modified your spring file to add a node and change the server location:

<elasticsearch:node properties="esProperties" name="node" />
<elasticsearch:client id="esClient2" mappings="experiment2/NewTitles" esNodes="localhost:9300" forceMapping="true" properties="esProperties" />

I started your main() and the factory starts as expected. No error seen. Not sure where your issue came from.

-- David Pilato | Technical Advocate | Elasticsearch.com @dadoonet | @elasticsearchfr

On 6 January 2014 at 15:18:42, Ramdev Wudali (agasty...@gmail.com) wrote:

Hi David: Sorry for the delay in my response (the weekend chores took over). Here is my project (a tgz archive file). It's a maven project, so you should be able to import it into your IDE of choice (I have used IntelliJ, so you may find some of those artifacts as well). I have not included any data (the data format is just strings (titles), one per line). The path is specified in the spring config file that is included in the resources folder. Please do let me know if you find something… Thanks, Ramdev

On Fri, Jan 3, 2014 at 3:20 PM, David Pilato da...@pilato.fr wrote:

Could you share your project or gist your files and source code?

-- David ;-) Twitter: @dadoonet / @elasticsearchfr / @scrutmydocs

On 3 January 2014 at 22:08, Ramdev Wudali agasty...@gmail.com wrote:

Hi David: I set up the config to run on ports 8200 and 8300 (instead of the default 9200 and 9300, as they were taken up by tomcat).

On Fri, Jan 3, 2014 at 2:38 PM, David Pilato da...@pilato.fr wrote:

Is it a typo? esNodes="elasticsearch.server:8300" - should be 9300, right?

-- David ;-) Twitter: @dadoonet / @elasticsearchfr / @scrutmydocs

On 3 January 2014 at 21:35, Ramdev Wudali agasty...@gmail.com wrote:

Hi David: Thanks for the speedy response. Here is an update to my problem. I was trying to create a different type within the same index (index: experiment, type: Titles, and I was trying to create type: NewTitles). I am not sure if this has any bearing on the problem. After posting the question on the group, I went ahead and created a separate index (experiment2), and within this new index I created the type NewTitles. When I ran my application, there were no problems during the Spring elasticsearch client initialization. This basically tells me there is a conflict in the creation of a new type under an existing index (I am not able to figure out why there is a conflict). And I am not mixing versions of ElasticSearch between client and node (both use 0.90.5). Hope this helps. Thanks, Ramdev

On Fri, Jan 3, 2014 at 2:29 PM, David Pilato da...@pilato.fr wrote:

Any chance you are mixing elasticsearch versions between node and client?

-- David ;-) Twitter: @dadoonet / @elasticsearchfr / @scrutmydocs

On 3 January 2014 at 20:16, Ramdev Wudali agasty...@gmail.com wrote:

Hi All: I am trying to index a set of documents with the following mapping (which resides in the src/main/es/experiment folder in my project):

{
  "NewTitles": {
    "properties": {
      "DOC_ID": { "type": "string" },
      "TITLE": {
        "type": "multi_field",
        "fields": {
          "TITLE": { "type": "string" },
          "sortable": { "type": "string", "index": "not_analyzed" },
          "autocomplete": { "type": "string", "index_analyzer": "shingle_analyzer" }
        }
      }
    }
  }
}

There is also a _settings.json file which defines the shingle_analyzer like so:

{
  "index": {
    "analysis": {
      "filter": {
        "shingle_filter": {
          "type": "shingle",
          "min_shingle_size": 2,
          "max_shingle_size": 5
        }
      },
      "analyzer": {
        "shingle_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "shingle_filter"]
        }
      }
    }
  }
}

I am initializing the Elasticsearch client using spring elasticsearch like so:

<util:map id="esProperties">
  <entry key="cluster.name" value="elasticsearch-experiment" />
</util:map>
<elasticsearch:client id="esClient2" mappings="experiment/NewTitles" esNodes="elasticsearch.server:8300" forceMapping="true" properties="esProperties" />

The elasticsearch instance already has the index experiment and type Titles. When I run my app to index some new content, I get an error during Spring
Re: Searching indexed fields without analysing
Hi Chris, could you try escaping the "-" in the query for the "not_analyzed" field? See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#_reserved_characters

I hope this helps.

Regards, Jun Ohtani joht...@gmail.com blog: http://blog.johtani.info twitter: http://twitter.com/johtani
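Putting this suggestion together with the multi_field mapping shown earlier, a query against the not_analyzed raw subfield with the hyphens escaped might look like this (an untested sketch; the index name is taken from the earlier _mapping call and is illustrative):

```shell
# Query the not_analyzed "raw" subfield so the term is not split,
# escaping the reserved "-" characters for the query_string parser.
# (The backslashes are doubled: once for JSON, once for Lucene.)
curl -XGET 'localhost:9200/logstash-2014.01.08/_search?pretty' -d '{
  "query": {
    "query_string": {
      "query": "User_Name.raw: w\\-dc\\-01$"
    }
  }
}'
```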
Re: How to configure and implement Synonyms with multi words.
Also, I have another scenario where my index has words like:

software engineer, se --- this should get searched when I search on Software Engineer
team lead, lead, tl --- this should get searched when I search on Team Lead

The following are the queries to create the records:

curl -XPUT 'http://localhost:9200/employee/test/11?pretty' -d '{"designation": "software engineer"}'
curl -XPUT 'http://localhost:9200/employee/test/12?pretty' -d '{"designation": "se"}'
curl -XPUT 'http://localhost:9200/employee/test/13?pretty' -d '{"designation": "sse"}'
curl -XPUT 'http://localhost:9200/employee/test/14?pretty' -d '{"designation": "senior software engineer"}'
curl -XPUT 'http://localhost:9200/employee/test/15?pretty' -d '{"designation": "team lead"}'
curl -XPUT 'http://localhost:9200/employee/test/16?pretty&refresh=true' -d '{"designation": "tl"}'
curl -XPUT 'http://localhost:9200/employee/test/17?pretty&refresh=true' -d '{"designation": "lead"}'

On Thursday, January 9, 2014 2:12:05 PM UTC+5:30, Jayesh Bhoyar wrote:

Hi, I have the following synonyms that I want to configure:

software engineer = software engineer, se
senior software engineer = senior software engineer, sse
team lead = team lead, lead, tl

So if I search for se or Software Engineer, it should return the records having software engineer. What mapping should I apply on the Designation field, and what query should I fire to get the result? Is it possible to use a multi_match query?
Re: Pls help me: i insert log to elasticsearch, but it use too much memory, how to solve it?thanks
Thanks David.

Yes, I tested it with curl. If the json data is not too big, there is no problem. The test json format is the following:

{
  "name": ["user1", "user2", "user3", ...],
  "product": {},
  "price": {}
}

The difference between the two json documents is that the second one includes too many keys/values, like the following:

{
  "name": ["user1", "user2", "user3", ...],
  "product": {},
  "price": {},
  "attr": {
    "user1": [{"costprice": 122}, {"sellprice": 124}, {"stock": 12}, {"sell": 122}, ...],
    "user2": [{"costprice": 122}, {"sellprice": 124}, {"stock": 12}, {"sell": 122}, ...],
    "user3": [{"costprice": 122}, {"sellprice": 124}, {"stock": 12}, {"sell": 122}, ...],
    ...
  }
}

There are more than 3000 items in the attr key, so it used too much memory. Thanks again.

On Thursday, January 9, 2014 3:15:59 PM UTC+8, David Pilato wrote:

Just wondering if you are hitting the same RAM usage when inserting without thrift? Could you test it? Could you also gist what this gives:

curl -XGET 'http://localhost:9200/_nodes?all=true&pretty=true'

-- David Pilato | Technical Advocate | Elasticsearch.com @dadoonet | @elasticsearchfr

On 9 January 2014 at 07:11:33, xjj2...@gmail.com wrote:

The env is the following:
-- elasticsearch v0.90 (I use 0.90.9; the problem still exists).
-- java version is 1.7.0_45

On Wednesday, January 8, 2014 6:58:02 PM UTC+8, xjj2...@gmail.com wrote:

Dear all: I insert logs to elasticsearch; each log is about 2M, and there are about 3000 keys and values. When inserting, it used about 30G memory, and then elasticsearch is very slow and it's hard to insert logs. Could someone help me solve this? Thanks very much.
How's the encoding handling power of ES?
Hi all, I am wondering how Elasticsearch deals with documents in different encodings, such as different languages. Could you provide me with a tutorial about it? Do I need to manually specify the encoding format of the document when posting? Best, Ivan
Best way to match URLs
Hi! I am using ES together with logstash and we are indexing simple access log files. Our problem is that we want to know the number of image views for a resource, which is determined by a specific REST url:

GET /resource/id/image - i.e. GET /resource/abcde/image

This results in millions of different URLs that all mean "image view". Another problem is that there are other unpredictable resources under /image - e.g. GET /resource/abcde/image/anything - so a search like

url:get AND url:resource AND url:image AND -url:?

does not work, since I do not know what to exclude. I was thinking about using a regexp for this; performance is not really a problem (at least not at the moment), since this is mainly for reporting. However, I have not been able to solve it. If using a regexp, should the field be analyzed or not_analyzed? I have tried both using a template, but I am still unable to get it working:

"url": {
  "type": "multi_field",
  "fields": {
    "name": { "type": "string", "index": "analyzed" },
    "facet": { "type": "string", "index": "not_analyzed" }
  }
}

Anyway, any suggestions about how to solve this would be highly appreciated. Kind regards, Johan
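Since the template above already keeps an untokenised copy of the URL, one option (a hedged sketch; the field name url.facet comes from that template, the index pattern is illustrative) is a regexp query against the not_analyzed subfield. Elasticsearch regexp patterns must match the whole term, so the unpredictable /image/anything URLs fall out naturally:

```shell
# Match exactly "GET /resource/<one path segment>/image" against the
# not_analyzed subfield. The pattern is anchored to the whole term, so
# "GET /resource/abcde/image/anything" will not match.
curl -XGET 'localhost:9200/logstash-*/_search?pretty' -d '{
  "query": {
    "regexp": {
      "url.facet": "GET /resource/[^/]+/image"
    }
  }
}'
```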
Re: No hit using scan/scroll with has_parent filter
Hi Martijn, thanks for your answer. You can find in the gist below some HTTP conversations made on my ES 0.90.6 node, as well as a link to the Java code responsible for the calls: https://gist.github.com/jblievremont/8331460

Please note that the issue appears only when combining scan/scroll with a has_parent filter; it seems to work when using a has_parent query instead.

Best regards,
-- Jean-Baptiste Lièvremont

On Thursday, 9 January 2014 00:18:14 UTC+1, Martijn v Groningen wrote:

Hi Jean, can you share how you execute the scan request with the has_parent filter? (via a gist or something like that) Martijn

On 8 January 2014 15:17, Jean-Baptiste Lièvremont jean-baptist...@sonarsource.com wrote:

Hi folks, I use a parent/child mapping configuration which works flawlessly with classic search requests, e.g. using has_parent to find child documents with criteria on the parent documents. I am trying to get all child document IDs that match a given set of criteria using scan and scroll, which also works well - until I introduce the has_parent filter, in which case the scroll request returns no hit (although total_hits is correct). Is this a known issue? I can provide sample mapping files and queries with associated/expected results. Please note that this behavior was noticed on 0.90.6 but is still present in 0.90.9. Thanks, best regards,
-- Jean-Baptiste Lièvremont

-- Kind regards, Martijn van Groningen
Re: Pls help me: i insert log to elasticsearch, but it use too much memory, how to solve it?thanks
I see. You probably have to merge mappings with very big mappings! What is your application searching for? Logs? Users? -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs Le 9 janv. 2014 à 10:06, xjj210...@gmail.com a écrit : Thanks David . Yes , I test it with curl. If the json data is not too big, There is no problem. The test json format is following: { name:[user1,user2,user3,], product:{}, price:{} } The difference is the two json data is : The last json data include too many key/value, like the following: { name:[user1,user2,user3,], product:{}, price:{}, attr:{ user1:{{costprice:122},{sellprice:124},{stock:12},{sell:122},{},{}], user2:{{costprice:122},{sellprice:124},{stock:12},{sell:122},{},{}],, user3:{{costprice:122},{sellprice:124},{stock:12},{sell:122},{},{}], .. } } There are more than 3000 items in attr key. So it used too many memory. Thanks again. On Thursday, January 9, 2014 3:15:59 PM UTC+8, David Pilato wrote: Just wondering if you are hitting the same RAM usage when inserting without thrift? Could you test it? Could you gist as well what gives: curl -XGET 'http://localhost:9200/_nodes?all=truepretty=true' -- David Pilato | Technical Advocate | Elasticsearch.com @dadoonet | @elasticsearchfr Le 9 janvier 2014 at 07:11:33, xjj2...@gmail.com (xjj2...@gmail.com) a écrit: The env is following: --elasticseasrch v0.90( i use 0.90.9 , the problem is still exist). -- java version is 1.7.0_45 On Wednesday, January 8, 2014 6:58:02 PM UTC+8, xjj2...@gmail.com wrote: Dear all: I insert 1 logs to elasticsearch, each log is about 2M, and there are about 3000 keys and values. when i insert about 2, it used about 30G memory, and then elasticsearch is very slow, and it's hard to insert log. Could someone help me how to solve it? Thanks very much. -- You received this message because you are subscribed to the Google Groups elasticsearch group. 
Re: Converting queries returning certain distinct records to ES
Okay, thank you for your response. Here is an attempt at an example of what I am trying to achieve. Let's say I have the documents:

{ "id": 1, "name": "peter",  "class": 2, "grade": "b", "hair": "grey" }
{ "id": 2, "name": "paul",   "class": 2, "grade": "b", "hair": "purple" }
{ "id": 3, "name": "john",   "class": 1, "grade": "b", "hair": "grey" }
{ "id": 4, "name": "sandra", "class": 1, "grade": "a", "hair": "green" }
{ "id": 5, "name": "sarah",  "class": 1, "grade": "a", "hair": "green" }

Initially I want to get only one student from each possible [class, grade] combination, so I want ES to return peter, john and sandra, but not paul or sarah. The grades will range over the letters [a, b, c, d, e], but the class could be anything. Additionally I might want to add a condition to this, such as only getting students with green hair. In that case I would only want to return sandra, because while sarah has green hair, she has the same [class, grade] as sandra. I thought about using facets for the first query, but I cannot see how that would give me a collection of the right ids to make the second query with.

On Thursday, January 9, 2014 7:57:09 AM UTC, David Pilato wrote: Maybe you could find a way to do that with a single query if you design your documents in another way? Or using facets for the first query and an ids filter for the second? It's hard to tell without a concrete example of JSON documents. -- David Pilato | Technical Advocate | Elasticsearch.com @dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr

On 9 January 2014 at 01:28:06, hea...@hodgetastic.com wrote: Hello, I am currently trying to migrate an SQL application to Elasticsearch. I need to be able to select a collection of results from an index which, for given search conditions, have distinct pairings of two certain columns. In SQL I do the following two queries:

Query 1: SELECT column_A, column_B, GROUP_CONCAT(table_name.id) id FROM `table_name` WHERE `column_?` = 'something' GROUP BY column_A, column_B, column_?
Query 2: SELECT `table_name`.* FROM `table_name` WHERE `column_?` = 'something' AND (`table_name`.`id` IN (ids_from_previous_query))

The first query returns a list of ids from table_name such that each id satisfies the condition `column_?` = 'something' and the record with that id has a distinct [column_A, column_B]. The second query then returns all the records satisfying `column_?` = 'something', but only from that range of ids. (I realise I probably do not need to repeat `column_?` = 'something' in the second query.) The result is that each record returned by the second query satisfies the condition `column_?` = 'something', and I am only returned one record for each [column_A, column_B] pairing. Since there is not really a 'distinct' option yet, I am having trouble finding a way to replicate this output with ES, and wondered if anyone might have any thoughts on how I might go about it? At the moment I am open to any mapping/query combinations that will achieve what I need.
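One pre-1.0 way to combine David's two suggestions, sketched under the assumption (not from the thread) that a combined `class_grade` field such as "2_b" is added at index time: a terms facet gives one bucket per distinct pair, and a follow-up query fetches one representative per bucket.

```json
{ "query": { "term": { "hair": "green" } },
  "size": 0,
  "facets": {
    "pairs": { "terms": { "field": "class_grade", "size": 1000 } }
  } }
```

Then, for each facet term returned (e.g. "1_a"), fetch a single document:

```json
{ "query": { "filtered": {
    "query":  { "term": { "hair": "green" } },
    "filter": { "term": { "class_grade": "1_a" } } } },
  "size": 1 }
```

This needs one extra round trip per distinct pair, so it only suits a modest number of [class, grade] combinations.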
issues with timestamp sorting
I have created a timestamp field (not to be confused with _timestamp) and want to sort on the timestamp field in descending order, but the result contains some records out of order. The mapping and the sorting criteria look like:

"timestamp": { "type": "date", "format": "dateOptionalTime", "include_in_all": false }

SearchRequestBuilder.addSort("timestamp", SortOrder.DESC);

The result I get is:

2013-12-26T09:14:09.617Z, 2013-12-26T12:01:07.389Z, 2013-12-26T12:00:20.126Z, 2013-12-26T11:59:15.594Z, 2013-12-26T11:58:00.083Z, 2013-12-26T11:55:52.372Z

Is it because of the dateOptionalTime format, or am I missing something? Thanks
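For reference, the Java sort above corresponds to this REST body (a sketch, not from the thread):

```json
{ "query": { "match_all": {} },
  "sort": [ { "timestamp": { "order": "desc" } } ] }
```

A date field sorts on its parsed millisecond value, so dateOptionalTime itself should not produce the interleaving shown. It may be worth confirming with GET /index/_mapping that every index being searched really maps timestamp as a date; if one index dynamically mapped it as a string before the explicit mapping was applied, the sort is comparing values from effectively different fields.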
unexpected behavior of pagination using offset and size
My application has a pagination requirement for search. I am using the offset and size options to achieve pagination. Making quick clicks through the pagination sometimes gives no results at all. Does the asynchronous search call bring any side effects like this? Thanks,
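For reference, offset pagination in the REST API uses from and size (a sketch with an assumed page size of 10; page 3 starts at offset 20):

```json
{ "from": 20, "size": 10, "query": { "match_all": {} } }
```

The from/size mechanism itself is stateless and should not return empty pages; when rapid clicks intermittently yield no results, it may be worth checking whether the client is cancelling or interleaving the asynchronous calls and rendering a stale or aborted response.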
Architecture question re. routing and multi DC
For redundancy purposes, our system is split into two datacenters. One of the DCs is considered central, where all the backoffice systems reside, and the other is edge. Recently we started using Logstash with Elasticsearch and Kibana. The architecture we have is: - A Scribe server on each instance in our cluster forwards logs to a main scribe instance in the DC. - If the DC is the edge, its main scribe instance forwards all logs to the main scribe instance in central. - From the main (central) scribe server we forward messages to Logstash, which in turn writes them to ES. Because most logs are only stored but never retrieved, to reduce the traffic between DCs we thought of using custom routing: - Have an Elasticsearch node in each DC (currently we have only one). - Tag each log message with the DC it originated from and route the log messages according to this tag, so each DC's log messages end up in its own ES instance. Will this work? Is this a proper use of Elasticsearch's routing?
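A caution worth stating (my reading, not from the thread): document routing (the ?routing= parameter) only selects which shard of an index a document lands on; shard allocation decides which node hosts that shard, so routing alone will not keep data on DC-local nodes. A sketch of the alternative, using per-DC indices pinned with shard allocation filtering (the attribute name "dc" and index names are hypothetical):

```
# elasticsearch.yml on each edge node
node.dc: edge

# settings on the edge index, pinning its shards to edge nodes
PUT /logstash-edge-2014.01.09/_settings
{ "index.routing.allocation.include.dc": "edge" }
```

Kibana can still search across both index families with an index pattern, while each DC's shards stay on its own nodes.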
Logstash Embedded Elasticsearch not starting.
Good morning, I am running a very basic config of logstash with the embedded elasticsearch. I am able to launch the logstash embedded elasticsearch successfully while using a local disk for the data directory; however, when I use the option -Des.path.data to specify a data directory located on an NFS share, elasticsearch will not start. I'm assuming this is a locking issue:

log4j, [2014-01-08T15:13:25.560] INFO: org.elasticsearch.node: [Set] version[0.90.3], pid[16971], build[5c38d60/2013-08-06T13:18:31Z]
log4j, [2014-01-08T15:13:25.561] INFO: org.elasticsearch.node: [Set] initializing ...
log4j, [2014-01-08T15:13:25.561] DEBUG: org.elasticsearch.node: [Set] using home [/srv/log/logstash], config [/srv/log/logstash/config], data [[/srv/log/logstash/data]], logs [/srv/log/logstash/logs], work [/srv/log/logstash/work], plugins [/srv/log/logstash/plugins]
log4j, [2014-01-08T15:13:25.567] INFO: org.elasticsearch.plugins: [Set] loaded [], sites []
log4j, [2014-01-08T15:13:25.584] DEBUG: org.elasticsearch.common.compress.lzf: using [UnsafeChunkDecoder] decoder

and no further progress. Is there a workaround for this problem, as I have a requirement to use NFS for the data directory? -- A.
Re: cassandra river plugin installation issue
Issue solved: in the river code, when fetching data from Cassandra it uses HFactory.createRangeSlicesQuery(keyspace, STR, STR, STR) to get data, and the table I was using had a primary key of type int. After changing that to text, it started pulling data from Cassandra into ES. Thanks
Re: SSL and org.elasticsearch.transport.NodeDisconnectedException
We don't have that at this time. Basically, elasticsearch nodes very often sit in a backend layer, so securing the transport is not usually needed, and it also comes with a cost. Could you secure your transmissions at the network level? -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 9 Jan 2014 at 13:30, Maciej Stoszko maciek...@gmail.com wrote: Thanks David. Does it mean that, at least currently, there is no avenue to secure the transport layer with SSL? Maciej
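One way to follow David's network-level suggestion is a TLS tunnel in front of the transport port. A minimal stunnel sketch; every path, port, and hostname here is hypothetical:

```
; client-side stunnel.conf: forward local 9301 over TLS to the remote node
cert = /etc/stunnel/es.pem

[es-transport]
client = yes
accept = 127.0.0.1:9301
connect = remote-es-host:9302
```

A matching server-side stunnel on the remote host accepts 9302 and connects to its local 9300, and the local node is pointed at 127.0.0.1:9301 instead of the remote transport address. This encrypts the link without elasticsearch itself knowing about SSL.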
Re: Problem with highlight after upgrade from 0.20.4 to 0.90.9
So after trying around, I have found out that if I just have name and description, and not task.name and task.description, in the highlight section, the highlighting works. So is TYPE.FIELD not supported for highlights in 0.90.x? Or could I do it with some other syntax? If not, how do you solve the case where you want to query many types with a name field, but have different settings per type? /Calle

On Thursday, 9 January 2014 at 11:31:31 UTC+1, Calle Arnesten wrote: Hi, I have updated Elasticsearch from 0.20.4 to 0.90.9 and have a problem getting the highlighting to work. The following search body (JSON-stringified) worked in the old version:

{
  "query": { "custom_filters_score": { "query": { "query_string": { "query": "test", "fields": ["task.name^3", "task.description"] } } } },
  "size": 11,
  "highlight": {
    "encoder": "html",
    "fields": {
      "_all": { "number_of_fragments": 5 },
      "task.name": { "fragment_size": 200 },
      "task.description": { "fragment_size": 100 },
      "require_field_match": true
    }
  }
}

In the new version it returns:

{ "took": 5, "timed_out": false, "_shards": { "total": 1, "successful": 1, "failed": 0 }, "hits": { "total": 2, "max_score": 1.5957302, "hits": [ { "_index": "board", "_type": "task", "_id": "9160af7f92b9f5c769351d62650028e0", "_score": 1.5957302, "_source": { "name": "Test1", "description": "" } }, { "_index": "board", "_type": "task", "_id": "9160af7f92b9f5c769351d6265003ae4", "_score": 1.5957302, "_source": { "name": "Test2", "description": "" } } ] } }

In 0.20.4 each item in the hits array contained a highlight property, but now it doesn't. Why is that not included anymore? Any help is appreciated. /Calle
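Based on Calle's own observation, a highlight section that works in 0.90.x drops the type prefix. Note also that in the original body require_field_match sits inside fields, where it is presumably parsed as a field name rather than an option; moving it up to the highlight level (an assumption on my part, matching how the other highlight options are nested) gives:

```json
{ "highlight": {
    "encoder": "html",
    "require_field_match": true,
    "fields": {
      "_all":        { "number_of_fragments": 5 },
      "name":        { "fragment_size": 200 },
      "description": { "fragment_size": 100 }
    } } }
```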
Re: how to stop running river plugin
When you delete the river (remove the _meta doc). -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 9 Jan 2014 at 14:58, shamsul haque shams...@gmail.com wrote: Hi, I have configured and started a river with my ES. But how may I stop or close my running river, if I want to do so? I have seen the public void close() method in the River code; when does it get called? Thanks
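Concretely, David's answer means deleting the river's document from the _river index (the river name my_river is hypothetical):

```
DELETE /_river/my_river/
```

When the river is removed, its close() method is invoked as the river instance shuts down.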
Re: How's the encoding handling power of ES?
There is an example of indexing and querying in this SO answer: http://stackoverflow.com/questions/8734888/how-to-search-for-utf-8-special-characters-in-elasticsearch hth Jason

On Thu, Jan 9, 2014 at 5:13 PM, HongXuan Ji hxua...@gmail.com wrote: Hi all, I am wondering how Elasticsearch deals with documents in different encodings, such as different languages. Could you point me to a tutorial about this? Do I need to manually specify the encoding format of the document when posting? Best, Ivan
Re: Corrupt index creation when elasticsearch is killed just after index is created
Sorry, my fault. I stand corrected. There is replication=sync and consistency=all, but just for index creation that is triggered by a document creation when the index does not yet exist (auto creation). It's not there for explicit index creation (where there is no document to be created). If you explicitly execute index creation, you can add a master node timeout to the operation, and if it is exceeded, the operation will return that it was not acknowledged by all nodes. Jörg
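The timeout Jörg mentions can be passed on an explicit create-index call; a sketch (the index name and timeout value are arbitrary, and the exact REST parameter semantics should be checked against the docs for your version):

```
PUT /my_index?master_timeout=30s
```

In the Java API this corresponds to setMasterNodeTimeout on the CreateIndexRequest; if the cluster state update is not acknowledged in time, the response reports acknowledged: false rather than failing silently.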
Re: allow_explicit_index and _bulk
Opened an issue: https://github.com/elasticsearch/elasticsearch/issues/4668

On Thursday, January 9, 2014 3:39:39 AM UTC-5, Alexander Reelsen wrote: Hey, after having a very quick look, it looks like a bug (or wrong documentation, need to check further). Can you create a github issue? Thanks! --Alex

On Wed, Jan 8, 2014 at 11:08 PM, Gabe Gorelick-Feldman gabego...@gmail.com wrote: The documentation on URL-based access control (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/url-access-control.html) implies that _bulk still works if you set rest.action.multi.allow_explicit_index: false, as long as you specify the index in the URL. However, I can't get it to work.

POST /foo/bar/_bulk
{ "index": {} }
{ "_id": "1234", "baz": "foobar" }

returns "explicit index in bulk is not allowed". Should this work?
Re: How to configure and implement Synonyms with multi words.
Here is a little example of query-time multi-word synonyms: https://gist.github.com/mattweber/7374591 Hope this helps. Thanks, Matt Weber

On Thu, Jan 9, 2014 at 12:56 AM, Jayesh Bhoyar jsbonline2...@gmail.com wrote: Also I have another scenario where my index has words like "software engineer", "se" -- these should be found when I search on "Software engineer"; and "team lead", "lead", "tl" -- these should be found when I search on "Team Lead". The following queries create the records:

curl -XPUT 'http://localhost:9200/employee/test/11?pretty' -d '{"designation": "software engineer"}'
curl -XPUT 'http://localhost:9200/employee/test/12?pretty' -d '{"designation": "se"}'
curl -XPUT 'http://localhost:9200/employee/test/13?pretty' -d '{"designation": "sse"}'
curl -XPUT 'http://localhost:9200/employee/test/14?pretty' -d '{"designation": "senior software engineer"}'
curl -XPUT 'http://localhost:9200/employee/test/15?pretty' -d '{"designation": "team lead"}'
curl -XPUT 'http://localhost:9200/employee/test/16?pretty&refresh=true' -d '{"designation": "tl"}'
curl -XPUT 'http://localhost:9200/employee/test/17?pretty&refresh=true' -d '{"designation": "lead"}'

On Thursday, January 9, 2014 2:12:05 PM UTC+5:30, Jayesh Bhoyar wrote: Hi, I have the following synonyms that I want to configure:

software engineer => software engineer, se
senior software engineer => senior software engineer, sse
team lead => team lead, lead, tl

So that if I search for "se" or "Software Engineer" it should return the records having "software engineer". What mapping should I apply on the designation field, and what query should I fire to get the result? Is it possible to use a multi_match query?
The following queries create the records:

curl -XPUT 'http://localhost:9200/employee/test/1?pretty' -d '{"designation": "software engineer"}'
curl -XPUT 'http://localhost:9200/employee/test/2?pretty' -d '{"designation": "software engineer"}'
curl -XPUT 'http://localhost:9200/employee/test/3?pretty' -d '{"designation": "senior software engineer"}'
curl -XPUT 'http://localhost:9200/employee/test/4?pretty' -d '{"designation": "senior software engineer"}'
curl -XPUT 'http://localhost:9200/employee/test/5?pretty' -d '{"designation": "team lead"}'
curl -XPUT 'http://localhost:9200/employee/test/6?pretty&refresh=true' -d '{"designation": "team lead"}'
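For the mapping question, a sketch of index settings using a synonym token filter; the analyzer and filter names are hypothetical, and this follows the standard synonym-filter syntax rather than anything given in the thread:

```json
{
  "settings": {
    "analysis": {
      "filter": {
        "designation_synonyms": {
          "type": "synonym",
          "synonyms": [
            "se => software engineer",
            "sse => senior software engineer",
            "tl, lead => team lead"
          ]
        }
      },
      "analyzer": {
        "synonym_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "designation_synonyms"]
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "designation": { "type": "string", "analyzer": "synonym_analyzer" }
      }
    }
  }
}
```

With the filter applied at index or search time, a plain match query on designation sees the expanded terms; for the multi-word synonym subtleties, Matt's gist above is the better reference.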
Re: Filter and Query same taking some time
Use a filtered query, not a top-level filter. You only want the top-level filter when you are faceting and don't want the filter to change the facet counts. http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-filtered-query.html Thanks, Matt Weber

On Thu, Jan 9, 2014 at 1:13 AM, Arjit Gupta arjit...@gmail.com wrote: I have 13 million documents, and with the same query I see filters performing worse than the query: the filter takes 400 ms whereas the query takes 300 ms.

1. Filter:

{ "size": 100, "query": { "match_all": {} }, "filter": { "bool": { "must": { "term": { "color": "red" } } } }, "version": true }

2. Query:

{ "size": 100, "query": { "bool": { "must": { "match": { "color": { "query": "red", "type": "boolean", "operator": "AND" } } } } }, "version": true }

Thanks, Arjit

On Thu, Jan 9, 2014 at 1:15 PM, David Pilato da...@pilato.fr wrote: Yeah, 10 documents is not that much! Not sure if you can notice a difference here, as probably everything could be loaded into the file system cache. -- David Pilato | Technical Advocate | Elasticsearch.com @dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr

On 9 January 2014 at 08:43:13, Arjit Gupta (arjit...@gmail.com) wrote: I have 100,000 documents which are similar. In the response I am getting the whole document, not just the id. I am executing the query multiple times. Thanks, Arjit

On Thu, Jan 9, 2014 at 1:06 PM, David Pilato da...@pilato.fr wrote: You probably won't see any difference the first time you execute it unless you are using warmers. With a second query, you should see the difference. How many documents do you have in your dataset?
-- David Pilato | Technical Advocate | Elasticsearch.com @dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr

On 9 January 2014 at 06:14:06, Arjit Gupta (arjit...@gmail.com) wrote: Hi, I had implemented ES search queries for all our use cases, but when I learned that some of our use cases can be solved by filters, I implemented that, but I don't see any gain in response time from filters. My search queries are:

1. Filter:

{ "size": 100, "query": { "match_all": {} }, "filter": { "bool": { "must": { "term": { "color": "red" } } } }, "version": true }

2. Query:

{ "size": 100, "query": { "bool": { "must": { "match": { "color": { "query": "red", "type": "boolean", "operator": "AND" } } } } }, "version": true }

By default the term filter should be cached, but I don't see a performance gain. Do I need to change some parameter as well? I am using ES 0.90.1 with 16 GB of heap given to ES. Thanks, Arjit
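Matt's advice, written out as a concrete request body (a sketch of the filtered-query form he links to, applied to the color example from the thread):

```json
{
  "size": 100,
  "query": {
    "filtered": {
      "query":  { "match_all": {} },
      "filter": { "term": { "color": "red" } }
    }
  },
  "version": true
}
```

Unlike the top-level filter, the filtered query applies the (cacheable) term filter during the search itself, so documents excluded by the filter are never scored and collected only to be discarded afterwards.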
Is there a kind of query/rescore/similarity magic that lets me know if all the terms in a field are matched?
I'm looking to boost matches where all the terms in the field match, more than I'm getting out of the default similarity. Is there some way to ask Elasticsearch to do that? I'm OK with only checking some small window of top documents, or really anything other than a large performance hit. To be honest, I haven't played much with similarities, so maybe what I want is already there. Thanks! Nik
Re: Is there a kind of query/rescore/similarity magic that lets me know if all the terms in a field are matched?
Nik, no, there is not. There's a work-around in which the number of terms in a field is stored in another field at indexing time. You can then analyze your query string to count its terms, and use that count to match against documents that have the same count. But consider the following field:

"text": "Very Big Dog"

Three terms in the field's value, right? And consider the query +text:very +text:very +text:very, as in:

{ "bool": { "must": [
  { "match": { "text": { "query": "very", "type": "boolean" } } },
  { "match": { "text": { "query": "very", "type": "boolean" } } },
  { "match": { "text": { "query": "very", "type": "boolean" } } }
] } }

Three query terms, right? But it will match the field, and the term counts will match, and therefore you will be told that "Very Very Very" is a perfect match for "Very Big Dog". Oops! This is a Lucene limitation. Probably not a really big deal; I only know of two search engines that properly handle duplicate terms: Google's, and the one I wrote in my previous life. But it is something that would be a very nice and useful feature for Lucene. Since Lucene already knows the word positions, it could verify that each term matches a unique word position (which is what I did in mine). Brian

On Thursday, January 9, 2014 11:18:50 AM UTC-5, Nikolas Everett wrote: I'm looking to boost matches where all the terms in the field match, more than I'm getting out of the default similarity. Is there some way to ask Elasticsearch to do that? I'm OK with only checking some small window of top documents, or really anything other than a large performance hit. To be honest, I haven't played much with similarities, so maybe what I want is already there. Thanks! Nik
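The store-a-term-count work-around Brian describes can be sketched with the token_count field type. Hedged: token_count arrived around elasticsearch 1.0 (so it may not exist on 0.90.x), and the field names here are hypothetical:

```json
{
  "properties": {
    "text": {
      "type": "string",
      "fields": {
        "num_terms": { "type": "token_count", "analyzer": "standard" }
      }
    }
  }
}
```

A query can then combine the usual match on text with a term filter on text.num_terms equal to the query's own term count, subject to the duplicate-terms caveat Brian raises.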
Re: Upgrades causing Elastic Search downtime
Perhaps I am missing some functionality since I am still on version 0.90.2, but wouldn't you have to disable/enable allocation around each server restart during a rolling upgrade? A restarted node will not host any shards while allocation is disabled. Cheers, Ivan

On Wed, Jan 8, 2014 at 5:48 PM, Mark Walkom ma...@campaignmonitor.com wrote: Disabling allocation is definitely a temporary-only change; you can set it back once your upgrades are done. Regards, Mark Walkom | Infrastructure Engineer | Campaign Monitor | email: ma...@campaignmonitor.com | web: www.campaignmonitor.com

On 9 January 2014 02:47, Jenny Sivapalan jennifer.sivapa...@gmail.com wrote: Thanks both for the replies. Our rebalance process doesn't take too long (~5 mins per node). I had some of the plugins (head, paramedic, bigdesk) open as I was closing down the old nodes and didn't see any split-brain issue, although I agree we can lead ourselves down this route by doubling the instances. We want our cluster to rebalance as we bring nodes in and out, so disabling is not going to work for us, unless I'm misunderstanding?

On Tuesday, 7 January 2014 22:16:46 UTC, Mark Walkom wrote: You can also use cluster.routing.allocation.disable_allocation to reduce the need to wait for things to rebalance. Regards, Mark Walkom | Infrastructure Engineer | Campaign Monitor | email: ma...@campaignmonitor.com | web: www.campaignmonitor.com

On 8 January 2014 04:41, Ivan Brusic iv...@brusic.com wrote: Although elasticsearch should support clusters of nodes with different minor versions, I have seen issues between minor versions. Version 0.90.8 did contain an upgrade of Lucene (4.6), but that does not look like it would cause your issue. You could look at the github issues tagged 0.90.[8-9] and see if something applies in your case. A couple of points about upgrading: if you want to use the double-the-nodes technique (which should not be necessary for minor version upgrades), you could decommission a node using the Shard API.
Here is a good writeup: http://blog.sematext.com/2012/05/29/elasticsearch-shard-placement-control/ Since you doubled the amount of nodes in the cluster, the minimum_master_nodes setting would be temporarily incorrect and potential split-brain clusters might occur. In fact, it might have occurred in your case since the cluster state seems incorrect. Merely hypothesizing. Cheers, Ivan On Tue, Jan 7, 2014 at 9:26 AM, Jenny Sivapalan jennifer@gmail.com wrote: Hello, We've upgraded Elastic Search twice over the last month and have experienced downtime (roughly 8 minutes) during the roll out. I'm not sure if it is something we are doing wrong or not. We use EC2 instances for our Elastic Search cluster and cloud formation to manage our stack. When we deploy a new version or change to Elastic Search we upload the new artefact, double the number of EC2 instances and wait for the new instances to join the cluster. For example, 6 nodes form a cluster on v 0.90.7. We upload the 0.90.9 version via our deployment process and double the number of nodes for the cluster (12). The 6 new nodes will join the cluster with the 0.90.9 version. We then want to remove each of the 0.90.7 nodes. We do this by shutting down the node (using the plugin head), waiting for the cluster to rebalance the shards and then terminating the EC2 instance. Then repeat with the next node. We leave the master node until last so that it does the re-election just once. The issue we have found in the last two upgrades is that while the penultimate node is shutting down the master starts throwing errors and the cluster goes red. To fix this we've stopped the Elastic Search process on master and have had to restart each of the other nodes (though perhaps they would have rebalanced themselves in a longer time period?). We find that we send an increased number of error responses to our clients during this time.
We've set our queue size for search to 300 and we start to see the queue get full: at java.lang.Thread.run(Thread.java:724) 2014-01-07 15:58:55,508 DEBUG action.search.type [Matt Murdock] [92036651] Failed to execute fetch phase org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution (queue capacity 300) on org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction$2@23f1bc3 at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:61) at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821) But also we see the following error, which we've been unable to find the diagnosis for: 2014-01-07 15:58:55,530 DEBUG index.shard.service [Matt Murdock] [index-name][4] Can not build 'doc stats' from engine shard state [RECOVERING] org.elasticsearch.index.shard.IllegalIndexShardStateException: [index-name][4] CurrentState[RECOVERING] operations only allowed
Re: Searching indexed fields without analysing
Chris, I updated one of my tests to reproduce your issue. My text field is a multi-field where text.na is the text field without any analysis at all. This Lucene query does not find anything at all: { "bool" : { "must" : { "query_string" : { "query" : "text.na:Immortal-Li*" } } } } But this one works fine: { "bool" : { "must" : { "prefix" : { "text.na" : { "prefix" : "Immortal-Li" } } } } } And returns the two documents that I expected: { "_index" : "mortal", "_type" : "elf", "_id" : "1", "_version" : 1, "_score" : 1.0, "_source" : { "cn" : "Celeborn", "text" : "Immortal-Lives forever" } } { "_index" : "mortal", "_type" : "elf", "_id" : "2", "_version" : 1, "_score" : 1.0, "_source" : { "cn" : "Galadriel", "text" : "Immortal-Lives forever" } } Note that in both cases, the query's case must match since the field value is not analyzed at all. I'm not sure if this is a true bug. In general, I find Lucene syntax somewhat useful for ad-hoc queries, and I find their so-called Simple Query Parser syntax to be completely unable to find anything when there is no _all field, whether or not I specify a default field. (But that's another issue I'm going to ask about in the near future.) Brian On Thursday, January 9, 2014 8:27:04 AM UTC-5, Chris H wrote: Hi, Jun. That doesn't seem to work. For a user with the username bob.smith-jones: - bob.smith-jones - matches - bob.smith- - matches - bob.smi* - matches - bob.smith-j* - no results - bob.smith\-j* - no results Also, a $ isn't one of the special characters. Thanks. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/6cb908eb-9ca7-4f05-815f-a868c45f9f66%40googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
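Brian's two request bodies above can be sketched as plain Python dicts to make the difference explicit; the field name text.na comes from his test mapping (a not_analyzed sub-field), and the wrapper function is illustrative:

```python
import json

# The query_string form that found nothing: the wildcard term goes through
# the Lucene query parser, which did not behave as hoped against the raw,
# unanalyzed term.
query_string_body = {
    "bool": {"must": {"query_string": {"query": "text.na:Immortal-Li*"}}}
}

# The prefix form that matched both documents: a term-level query that
# compares the indexed term's bytes directly, so no analysis is applied
# and the case must match exactly.
prefix_body = {
    "bool": {"must": {"prefix": {"text.na": {"prefix": "Immortal-Li"}}}}
}

def search_request(query):
    """Wrap a query clause into a full _search request body string."""
    return json.dumps({"query": query})
```

Either body would then be sent as the `-d` payload of a `curl` against `_search`.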
Re: Corrupt index creation when elasticsearch is killed just after index is created
Never, never, never kill -9 and expect any application to properly and cleanly shut down. Never. The -9 signal cannot be caught by the process to which it is directed. The process is ended in the middle of whatever it is doing. Issue a normal kill, and then ES (via the JVM) will have a chance to finish up whatever it is working on, and then shut down cleanly. Brian On Wednesday, December 25, 2013 11:36:32 PM UTC-5, tarang dawer wrote: I have reliably recreated this many times; it happens while creating an index on a single node (default 5 shards). I have set action.auto_create_index: false, discovery.zen.ping.multicast.enabled: false, node.master=true. I am creating indices via the Java API, and I kill (kill -9) the elasticsearch process immediately after the index is created.
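Brian's point about the -9 signal can be demonstrated in a few lines on any POSIX system: SIGTERM (a plain `kill`) is delivered to a handler the process installs, while SIGKILL (`kill -9`) can never be handled because the kernel refuses to install a handler for it.

```python
import os
import signal

received = []

def on_term(signum, frame):
    received.append(signum)  # a real server would flush and close here

# SIGTERM can be caught, giving the process a chance to shut down cleanly.
signal.signal(signal.SIGTERM, on_term)
os.kill(os.getpid(), signal.SIGTERM)  # simulate `kill <pid>` on ourselves

# SIGKILL cannot: attempting to install a handler for it raises an error.
try:
    signal.signal(signal.SIGKILL, on_term)
    sigkill_trappable = True
except (OSError, ValueError):
    sigkill_trappable = False

print(received == [signal.SIGTERM], sigkill_trappable)  # prints: True False
```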
Kibana3 - terms panel with range facet chart
Hey all, I just started using ElasticSearch with LogStash and Kibana. I'm able to extract fields from my log statements using logstash/grok. In Kibana I have taken some of these fields and created stats panels using them for stats like total/mean/min/max, which works great for just seeing a calculated number value quickly. What I would like to do next is create a chart that can display the count of occurrences for my extracted field within different ranges. So say my field is called upload_size; I would like to create a pie chart that displays the count of files uploaded within defined ranges. For example I would like to see counts of upload_size fields with values in these ranges: 0-10KB, 10KB-100KB, 100KB-1MB, 1MB-10MB, 10MB-100MB, 100MB-1GB, 1GB+, plotted in a pie chart. I've experimented with the terms panel creating a pie chart but don't see a way to define ranges. It seems this would be possible using ElasticSearch range facets: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets-range-facet.html Is it possible to do this currently in Kibana3? If not, is this on the roadmap? I am using Kibana3 milestone 4. Thanks, Erik
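The binning Erik describes can be sketched client-side, assuming upload_size is stored in bytes; a range facet would do the same bucketing server-side. The facet body at the end is illustrative (the facet name is made up):

```python
KB, MB, GB = 1024, 1024**2, 1024**3

RANGES = [
    (0,        10 * KB,  "0-10KB"),
    (10 * KB,  100 * KB, "10KB-100KB"),
    (100 * KB, 1 * MB,   "100KB-1MB"),
    (1 * MB,   10 * MB,  "1MB-10MB"),
    (10 * MB,  100 * MB, "10MB-100MB"),
    (100 * MB, 1 * GB,   "100MB-1GB"),
]

def bucket(upload_size):
    """Return the label of the range containing upload_size (in bytes)."""
    for lo, hi, label in RANGES:
        if lo <= upload_size < hi:
            return label
    return "1GB+"

# The corresponding range facet request body, built from the same ranges:
upload_facet = {
    "facets": {
        "upload_sizes": {
            "range": {
                "field": "upload_size",
                "ranges": [{"from": lo, "to": hi} for lo, hi, _ in RANGES]
                          + [{"from": 1 * GB}],
            }
        }
    }
}
```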
Percolate Query using _size matches no documents
Is filtering on the _size field allowed on percolate requests? Adding a _size range into a percolate query, in either a filter or the query section, matches no documents. Walking through reproducing the problem: Register the query: curl -XPUT 'http://localhost:9200/_percolator/test_index/queryNamedSue' -d '{ "query" : { "constant_score": { "filter": { "and": [ { "query" : { "query_string" : { "query" : "batman", "default_field" : "all" } } }, { "bool" : { "must" : [ { "term" : { "ni.language" : "en" } } ] } } ] } } } }' Then percolate a document (aside: the 'ni.content' field in our mapping is a default search field): curl -XGET 'http://localhost:9200/test_index/testType/_percolate' -d '{ "doc": { "ni": { "content": "So Batman walks into a bar", "language": "en" } } }' Which results in {"ok":true,"matches":["queryNamedSue"]} Now if the query is changed to include a _size range, curl -XPUT 'http://localhost:9200/_percolator/test_index/queryNamedSue' -d '{ "query" : { "constant_score": { "filter": { "and": [ { "query" : { "query_string" : { "query" : "batman", "default_field" : "all" } } }, { "bool" : { "must" : [ { "term" : { "ni.language" : "en" } }, { "range" : { "_size" : { "from" : 0, "to" : 1, "include_lower" : true, "include_upper" : true } } } ] } } ] } } } }' Percolating the same document yields {"ok":true,"matches":[]} I have researched and found that percolating with a mapping that enables and stores _size was failing over a year ago, but this issue was patched: https://github.com/elasticsearch/elasticsearch/pull/2353. We set a default template that, for all types in the mapping, enables _size and sets it to store.
Our percolator node uses the following configuration: Index Settings: index.number_of_shards: variesBasedOnWorkload index.number_of_replicas: 0 index.auto_expand_replicas: false index.dynamic: true index.mapper.dynamic: true index.store.compress.stored: true index.store.compress.tv: true index.term_index_divisor: 4 index.merge.scheduler.max_thread_count: 1 Node Settings: cache.memory.direct: false http.enabled: false gateway.type: none index.store.type: memory
Using NOT in a nested filter
I am having trouble with a filter. I have items in my index, with nested ratings: curl -XPOST "http://localhost:9200/nestedfilters/item/_mapping" -d ' { "item" : { "properties" : { "description" : { "type" : "string" }, "ratings" : { "type" : "nested", "properties" : { "rater_username" : { "type" : "string", "index" : "not_analyzed" }, "rating" : { "type" : "integer", "index" : "not_analyzed" } } } } } } ' I want to be able to find items where a certain user has not rated the item. I have tried using NOT, but it finds anything rated by anybody else, regardless of whether the specific user has rated it. I can't seem to figure out how to use a MISSING filter either. Here is what I have tried: curl -XPOST "http://localhost:9200/nestedfilters/item/_search?pretty=true" -d ' { "query" : { "match_all" : {} }, "filter" : { "nested" : { "path" : "ratings", "filter" : { "not" : { "term" : { "ratings.rater_username" : "user1" } } } } } } ' and curl -XPOST "http://localhost:9200/nestedfilters/item/_search?pretty=true" -d ' { "query" : { "match_all" : {} }, "filter" : { "nested" : { "path" : "ratings", "filter" : { "and" : [ { "term" : { "ratings.rater_username" : "user1" } }, { "missing" : { "field" : "ratings.rating" } } ] } } } } ' Here is the gist with a full example: https://gist.github.com/nathanmoon/8339950. Is there another way I haven't thought of to craft a filter like this? Or do I need to index my data differently to support this type of filtering? Thanks for any help! Nathan
Re: Restarting an active node without needing to recover all data remotely.
Just wanted to add a quick note: long recovery times (due to divergence of shards between primary/replica) are an issue that we will be addressing. No ETA as of yet, but something that is on the roadmap. :) -Zach On Wednesday, December 4, 2013 7:48:04 PM UTC-5, Greg Brown wrote: Thanks for the many responses, they were very helpful. For posterity, I wrote up a more detailed post of how we are managing restart times for our cluster: http://gibrown.wordpress.com/2013/12/05/managing-elasticsearch-cluster-restart-time/ -Greg
Re: Using NOT in a nested filter
Oh right. That should have been obvious. It seems to be working great that way. Thanks! Nathan On Jan 9, 2014, at 1:10 PM, Sloan Ahrens sl...@stacksearch.com wrote: You were close. You just had the nested and not filters in the wrong order, basically. Your (first) query says return items that have a rating with 'ratings.rater_username' not equal to 'user1'. And so you get the first item, since it meets that requirement. What you really want to say is return items for which all ratings have 'ratings.rater_username' not equal to 'user1'. Here is the query you want: curl -XPOST "http://localhost:9200/nestedfilters/item/_search" -d' { "query": { "match_all": {} }, "filter": { "not": { "nested": { "path": "ratings", "filter": { "term": { "ratings.rater_username": "user1" } } } } } }' Here is a runnable example you can play with (you will need ES installed and running at localhost:9200, or supply another endpoint): http://sense.qbox.io/gist/289ceb80480db8b6574d5f879358e50c97aaf5da - Co-Founder and CTO, StackSearch, Inc. Hosted Elasticsearch at http://qbox.io
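Sloan's fix comes down to where `not` sits relative to `nested`, which is easy to see with the two filter bodies side by side as dicts (path and field names are from the thread):

```python
# Wrong: inside the nested filter, `not` matches items where ANY single
# rating was left by someone other than user1 -- so one extra rating from
# anyone else makes the item match.
not_inside_nested = {
    "nested": {
        "path": "ratings",
        "filter": {"not": {"term": {"ratings.rater_username": "user1"}}},
    }
}

# Right: wrapping the whole nested clause in `not` excludes items where
# ANY rating was left by user1, i.e. keeps only items user1 never rated.
not_around_nested = {
    "not": {
        "nested": {
            "path": "ratings",
            "filter": {"term": {"ratings.rater_username": "user1"}},
        }
    }
}
```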
Setup parent/child using rivers
I am reading data from a Sql Server database/table using jdbc-river currently. As of now I have created a type for each table in my database. As the next step in my implementation I would like to use parent/child types so that I can translate the relationship between my sql tables and store them. Table1: col_id | name | prop1 | prop2 | prop3 child_table1: col_id | table_id | child_prop1 | child_prop2 | child_prop3 curl -XPUT 'localhost:9200/_river/parent/_meta' -d '{ "type" : "jdbc", "jdbc" : { "driver" : "com.mysql.jdbc.Driver", "url" : "jdbc:mysql://localhost:3306/test", "user" : "", "password" : "", "sql" : "select * from table1", "index" : "index1", "type" : "parent" } }' curl -XPUT 'localhost:9200/_river/child/_meta' -d '{ "type" : "jdbc", "jdbc" : { "driver" : "com.mysql.jdbc.Driver", "url" : "jdbc:mysql://localhost:3306/test", "user" : "", "password" : "", "sql" : "select * from child_table1", "index" : "index1", "type" : "child" } }' curl -XPOST 'localhost:9200/_river/child/_mapping' -d '{ "child": { "_parent": { "type": "parent" } } }' I would like to store my data in the following format: { "id": 1, "name": "name1", "prop1": "data", "prop2": "data", "prop3": "data", "child": [ { "child_prop1": "data", "child_prop2": "data", "child_prop3": "data" }, { "child_prop1": "data1", "child_prop2": "data1", "child_prop3": "data1" } ] } Can anyone comment on how I can use jdbc-rivers to store my data as parent/child types for the above scenario? Thanks
Re: Upgrades causing Elastic Search downtime
That setting tells the nodes to hold the shards they currently have, and in the event of a node going down for a restart/upgrade, don't redistribute across the cluster. When you bring the rebooted/upgraded node back it'll locally reinitialise the shards it still has. You can set that setting back to false when you have completed the upgrades/restarts and the cluster can rebalance if it feels the need to. Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 10 January 2014 04:07, Ivan Brusic i...@brusic.com wrote: Perhaps I am missing some functionality since I am still on version 0.90.2, but wouldn't you have to disable/enable allocation after each server restart during a rolling upgrade? A restarted node will not host any shards with allocation disabled. Cheers, Ivan
Re: Upgrades causing Elastic Search downtime
That is definitely not the behavior I have ever seen with elasticsearch. If you restart a node with allocation disabled, the restarted node will have no shards and the shards that it should contain are marked as unassigned. I have never seen a node reinitialize the shards it has. Cheers, Ivan On Thu, Jan 9, 2014 at 3:58 PM, Mark Walkom ma...@campaignmonitor.com wrote: That setting tells the nodes to hold the shards they currently have, and in the event of a node going down for a restart/upgrade, don't redistribute across the cluster. When you bring the rebooted/upgraded node back it'll locally reinitialise the shards it still has. You can set that setting back to false when you have completed the upgrades/restarts and the cluster can rebalance if it feels the need to. Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com
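The toggle being debated above can be sketched as the request bodies sent to the cluster settings endpoint (using the setting name from the 0.90.x line quoted in the thread; this is a sketch of the payloads, not a full rolling-upgrade script):

```python
import json

def allocation_settings(disabled):
    """Body for PUT /_cluster/settings toggling shard allocation."""
    return json.dumps({
        "transient": {
            "cluster.routing.allocation.disable_allocation": disabled
        }
    })

# Typical rolling-restart order: disable allocation, restart/upgrade one
# node, wait for it to rejoin, re-enable, wait for recovery, repeat.
disable_body = allocation_settings(True)
enable_body = allocation_settings(False)
```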
Re: How's the encoding handling power of ES?
Hi, Jason. Thanks for the reply. I read the post. I am also wondering how the encoding process of ES works and what underlying encoding is used in ES to store data. Do you have some documents about these? Thanks! Regards, Ivan On Thursday, January 9, 2014 at 10:08:26 PM UTC+8, Jason Wee wrote: There is an example of indexing and querying in this SO post: http://stackoverflow.com/questions/8734888/how-to-search-for-utf-8-special-characters-in-elasticsearch hth Jason On Thu, Jan 9, 2014 at 5:13 PM, HongXuan Ji hxu...@gmail.com wrote: Hi all, I am wondering how ElasticSearch deals with documents in different encodings, such as different languages. Could you provide me some tutorial about it? Do I need to manually specify the encoding format of the document when posting? Best, Ivan
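The short version of the encoding contract implied by Jason's link: documents travel to elasticsearch as JSON over HTTP, and JSON text defaults to UTF-8, so any language can be indexed without declaring an encoding per document, as long as the bytes on the wire are valid UTF-8. A minimal round-trip sketch:

```python
import json

# A document mixing scripts; nothing encoding-specific is declared.
doc = {"en": "elf", "zh": "精灵", "ru": "эльф"}

# What actually goes over the wire: UTF-8 encoded JSON bytes.
body = json.dumps(doc, ensure_ascii=False).encode("utf-8")

# Decoding the bytes back yields the original characters unchanged.
roundtrip = json.loads(body.decode("utf-8"))
print(roundtrip == doc)  # prints: True
```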
Running out of memory when parsing the large text file.
Hi all, I post several large text files, each about 20~30MB of plain text, into ES, and I use the attachment mapper as the field type to store these files. It costs a lot of memory: even when I post one file, the used memory grows from about 150MB to 250MB. BTW, I use the default tokenizer for this field. The file can generate many tokens, but what I don't understand is the memory cost. Does it store all the tokens in memory? Ideas? Cheers, Ivan
Re: Searching indexed fields without analysing
Hi Chris, I recreated your issue in the following gist: https://gist.github.com/johtani/8346404 And I tried changing the queries as follows: User_Name.raw:bob.smith-jones - matches User_Name.raw:bob.smi* - matches User_Name.raw:bob.smith-j* - matches User_Name.raw:bob.smith\-j* - matches I used the User_Name.raw field instead of User_Name. Sorry, it is not necessary to escape… And I don't know why Brian's example query_string query does not work… Does it make sense? Is this understanding mistaken? Jun Ohtani joht...@gmail.com blog : http://blog.johtani.info twitter : http://twitter.com/johtani
Re: 1.0.0 Beta2 - GET children for Parent/Child does not seem to work
Try adding ?routing=PARENTID where PARENTID is equal to the parent ID for a given child. HTH -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 10 Jan 2014 at 01:09, Yuri Panchenko yuri.panche...@gmail.com wrote: Hi, I'm doing a simple test with 1.0.0 Beta 2. I've indexed a parent record and three children. The head plugin shows all the children, and the search endpoint returns all three children with different id's. But, for some strange reason, I can only GET by id one of the children. Does someone have a clue, or could this be a bug? curl -XGET localhost:9200/d3/transactions/_search?pretty { "took" : 2, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 3, "max_score" : 1.0, "hits" : [ { "_index" : "d3", "_type" : "transactions", "_id" : "3", "_score" : 1.0, "_source" : { "date" : "2012-12-01", "description" : "Nail polish", "amount" : 80.00 } }, { "_index" : "d3", "_type" : "transactions", "_id" : "2", "_score" : 1.0, "_source" : { "date" : "2012-10-14", "description" : "Nail polish", "amount" : 70.00 } }, { "_index" : "d3", "_type" : "transactions", "_id" : "1", "_score" : 1.0, "_source" : { "date" : "2013-01-01", "description" : "Nail polish", "amount" : 75.50 } } ] } } curl -XGET localhost:9200/d3/transactions/1?pretty { "_index" : "d3", "_type" : "transactions", "_id" : "1", "_version" : 2, "exists" : true, "_source" : { "date" : "2013-01-01", "description" : "Nail polish", "amount" : 75.50 } } curl -XGET localhost:9200/d3/transactions/2?pretty { "_index" : "d3", "_type" : "transactions", "_id" : "2", "exists" : false } curl -XGET localhost:9200/d3/transactions/3?pretty { "_index" : "d3", "_type" : "transactions", "_id" : "3", "exists" : false }
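David's suggestion, sketched as a URL builder: a child document lives on the shard chosen by its parent's ID, so a direct GET must carry the same routing value or it may look on the wrong shard (which would explain exists:false for ids 2 and 3 above). The parent id "42" below is hypothetical:

```python
from urllib.parse import urlencode

def child_get_url(host, index, doc_type, doc_id, parent_id):
    """Build a GET URL for a child doc, routed by its parent's ID."""
    query = urlencode({"routing": parent_id, "pretty": "true"})
    return "http://%s/%s/%s/%s?%s" % (host, index, doc_type, doc_id, query)

url = child_get_url("localhost:9200", "d3", "transactions", "2", "42")
print(url)  # prints: http://localhost:9200/d3/transactions/2?routing=42&pretty=true
```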
Re: Searching indexed fields without analysing
If it helps, here are my index settings and mappings. Note that I chose the name text.na as the non-analyzed form, not text.raw. Perhaps I should follow convention? But for now, a rose by any other name is still not analyzed: { "settings" : { "index" : { "number_of_shards" : 1, "refresh_interval" : "1s", "analysis" : { "char_filter" : { }, "filter" : { "english_snowball_filter" : { "type" : "snowball", "language" : "English" } }, "analyzer" : { "english_stemming_analyzer" : { "type" : "custom", "tokenizer" : "standard", "filter" : [ "standard", "lowercase", "asciifolding", "english_snowball_filter" ] }, "english_standard_analyzer" : { "type" : "custom", "tokenizer" : "standard", "filter" : [ "standard", "lowercase", "asciifolding" ] } } } } }, "mappings" : { "_default_" : { "dynamic" : "strict" }, "ghost" : { "_all" : { "enabled" : false }, "_ttl" : { "enabled" : true, "default" : "1.9m" }, "properties" : { "cn" : { "type" : "string", "analyzer" : "english_stemming_analyzer" }, "text" : { "type" : "multi_field", "fields" : { "text" : { "type" : "string", "analyzer" : "english_stemming_analyzer", "position_offset_gap" : 4 }, "std" : { "type" : "string", "analyzer" : "english_standard_analyzer", "position_offset_gap" : 4 }, "na" : { "type" : "string", "index" : "not_analyzed" } } } } }, "elf" : { "_all" : { "enabled" : false }, "_ttl" : { "enabled" : true }, "properties" : { "cn" : { "type" : "string", "analyzer" : "english_stemming_analyzer" }, "text" : { "type" : "multi_field", "fields" : { "text" : { "type" : "string", "analyzer" : "english_stemming_analyzer", "position_offset_gap" : 4 }, "std" : { "type" : "string", "analyzer" : "english_standard_analyzer", "position_offset_gap" : 4 }, "na" : { "type" : "string", "index" : "not_analyzed" } } } } } } } Brian On Thursday, January 9, 2014 10:38:15 PM UTC-5, Jun Ohtani wrote: Hi Chris, I recreated your issue in the following gist: https://gist.github.com/johtani/8346404 And I tried changing the queries as follows: User_Name.raw:bob.smith-jones - matches User_Name.raw:bob.smi* - matches User_Name.raw:bob.smith-j* - matches User_Name.raw:bob.smith\-j* - matches I used the User_Name.raw field instead of User_Name.
Sorry, it is not necessary to escape the hyphen after all. And I don't know why Brian's example query_string query does not work… -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/b0aece84-052c-4efc-8a25-1b42850fefe4%40googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
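To make Jun's suggestion concrete, here is a sketch of the kind of search that targets the not-analyzed raw subfield directly. This is an illustration only: it assumes Elasticsearch on localhost:9200 and a logstash-2014.01.08 index with the multi_field mapping shown earlier in the thread; the account names are made up.

```shell
# Query the not-analyzed User_Name.raw subfield instead of User_Name.
# Because raw is not tokenized, "w-dc-01$" is a single term, so the
# hyphen is not split on and the trailing $ is preserved.
curl -s 'localhost:9200/logstash-2014.01.08/_search' -d '{
  "query": {
    "query_string": {
      "query": "User_Name.raw:w-dc-01$ OR User_Name.raw:w-dc-*"
    }
  }
}'
```

Note the trade-off Chris ran into: a not_analyzed field is matched exactly, which also makes it case sensitive; the analyzed User_Name field remains the one to use for lowercase free-text matching.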
Re: java client: typeExists() returns false after successful bulk index - why?
A quick guess: the first check works because the first document for that type has already been indexed, and therefore the type is created by the time the operation returns. But the second check doesn't work because there is a refresh interval between the completion of a bulk load operation and the documents actually becoming visible. And since it's the first document in the type, the type won't exist until that first document is indexed. Which is likely exactly what you want: bulk operations need to defer work until they are processed to allow for optimizations. I don't know the Lucene internals, but a B+Tree loads vastly quicker when keys are presorted and bulk-loaded instead of added and committed one by one. The experts can chime in later, and if I'm wrong or off base anywhere I welcome the correction!

Brian
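If the existence check has to run immediately after a bulk load, one workaround (a sketch, not something from the original thread) is to force a refresh first, so that newly indexed documents — and with them the new type — become visible without waiting for the refresh interval. The index name here is hypothetical; it assumes Elasticsearch on localhost:9200.

```shell
# Force a refresh of the index, then inspect its mappings; the freshly
# created type should now appear in the _mapping output.
curl -s -XPOST 'localhost:9200/myindex/_refresh'
curl -s 'localhost:9200/myindex/_mapping'
```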
Re: java client: typeExists() returns false after successful bulk index - why?
Yes! Adding a sleep after Future.get() of the bulk op 'fixed' my test - thank you. What you said about the bulk op being submitted but not yet processed makes sense (perhaps there is a separate API to query an op's completion status?), but what is puzzling to me is that the comments in the source of BulkResponse seem to imply it is constructed *after* the op completes: "Holding a response for each item responding (in order) of the bulk requests. Each item holds the index/type/id is operated on, and if it failed or not (with the failure message)." Thus I was expecting that by the time ListenableActionFuture<BulkResponse>.get() returns, the op is actually completed (not just submitted). Otherwise the status properties in the embedded BulkItemResponse would not be useful, right?

On Thu, Jan 9, 2014 at 8:13 PM, InquiringMind brian.from...@gmail.com wrote his reply quoted above.
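The distinction being discussed — "indexed and acknowledged" versus "visible to searches and existence checks" — can be sketched at the REST level. This is an illustrative sequence only; the index and type names are made up, and it assumes Elasticsearch on localhost:9200.

```shell
# 1) Bulk index a document; the response returns once the items are
#    processed, but the data is not yet visible to search.
curl -s -XPOST 'localhost:9200/myindex/mytype/_bulk' --data-binary '
{"index":{}}
{"user":"john.smith"}
'

# 2) Force a refresh so the new segment (and the new type) become visible.
curl -s -XPOST 'localhost:9200/myindex/_refresh'

# 3) HEAD request against the type: prints the HTTP status code
#    (200 if the type exists, 404 if not).
curl -s -o /dev/null -w "%{http_code}\n" -I 'localhost:9200/myindex/mytype'
```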
Re: Elasticsearch 0.90 installation with .rpm and logging
I had to add this to /usr/share/elasticsearch/bin/elasticsearch so that any startup of elasticsearch will pick it up:

ES_JAVA_OPTS="-Des.config=/etc/elasticsearch/elasticsearch.yml -Des.path.conf=/etc/elasticsearch/ -Des.path.home=/usr/share/elasticsearch -Des.path.logs=/var/log/elasticsearch -Des.path.data=/var/lib/elasticsearch -Des.path.work=/tmp/elasticsearch -Des.path.plugins=/usr/share/elasticsearch/plugins"

Is this a defect in the RPM distribution? We do not want to edit anything post-installation, because our installation is automated using yum. Should I raise a defect for this?
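One possibly cleaner alternative, worth checking against the specific RPM version in use: the init script installed by the Elasticsearch RPM normally sources /etc/sysconfig/elasticsearch, so path overrides can live there and survive package upgrades instead of patching bin/elasticsearch. The variable names below are an assumption based on the 0.90-era packaging and should be verified against the installed init script.

```shell
# Append path overrides to the sysconfig file read by the init script.
cat >> /etc/sysconfig/elasticsearch <<'EOF'
CONF_DIR=/etc/elasticsearch
CONF_FILE=/etc/elasticsearch/elasticsearch.yml
LOG_DIR=/var/log/elasticsearch
DATA_DIR=/var/lib/elasticsearch
WORK_DIR=/tmp/elasticsearch
EOF
```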
Re: Is there a help document about bigdesk plugin?
Hi, an explanation of how to use it can be found on the GitHub pages or the bigdesk.org web site. There is no single document explaining the individual charts, but we can start creating one. Feel free to ask. Regards, Lukáš

On 10.1.2014 at 7:35, Eric Lu lzy3...@gmail.com wrote:
Or some detailed introduction about the various charts?