Re: How to get Elasticsearch boolean match working for multiple fields
Dominic

Normal nomenclature is that Field is analyzed and Field.raw is not analyzed. Not sure why you would have both as not analyzed, given they would do the same thing, all else being equal.

When performing your original query above on fields I know are not_analyzed, I get no results, because there are no strings in the fields that match those terms exactly. I could of course look to do a regex query:

    GET /testingindex/mytesttype/_search
    {
      "query": { "bool": { "must": [
        { "regexp": { "message": ".*Failed password for.*" } },
        { "regexp": { "path": ".*/var/log/secure.*" } }
      ] } }
    }

On 8 May 2015 at 15:03, Dominic Nicholas dominic.s.nicho...@gmail.com wrote:

Hi Allan,

I really appreciate the thoughtful response. One comment before I try what you are suggesting...

Our path and message field mappings indicate not_analyzed, and we don't want to change them at this point. Someone suggested using the .raw versions of the fields (path.raw and message.raw), which does work. However, it leaves me with the question: if the original field mappings indicate the fields are not_analyzed, why is it necessary to use the .raw version?

Cheers
Dom

On Fri, May 8, 2015 at 6:37 AM, Allan Mitchell casfanal...@gmail.com wrote:

Hi

Have a look at the below and see if it is what you want.
    DELETE /testingindex

    PUT /testingindex
    {
      "settings": { "number_of_shards": 1 },
      "mappings": {
        "mytesttype": {
          "_source": { "enabled": false },
          "properties": {
            "message": { "type": "string", "index": "analyzed" },
            "path": { "type": "string", "index": "analyzed" }
          }
        }
      }
    }

    POST /testingindex/mytesttype/1
    { "message": "Failed password for some user or another", "path": "/wrong/path/" }

    POST /testingindex/mytesttype/2
    { "message": "Not the right message but the right path", "path": "/var/log/secure" }

    POST /testingindex/mytesttype/3
    { "message": "Failed password for some user or another", "path": "/var/log/secure" }

    POST /testingindex/mytesttype/4
    { "message": "Nothing is right here", "path": "/wrong/path/too" }

    GET /testingindex/mytesttype/_search

    GET /testingindex/mytesttype/_search
    {
      "query": { "bool": { "must": [
        { "match_phrase": { "message": "Failed password for some" } },
        { "match_phrase": { "path": "/var/log/secure" } }
      ] } }
    }
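The difference Allan describes between analyzed and not_analyzed fields can be illustrated without a cluster. The Python sketch below is only a rough stand-in: `standard_like_tokens` is a hypothetical helper that lowercases and splits on non-alphanumeric characters, which approximates but does not reproduce the real standard analyzer. It shows why a phrase query for 'var' can match an analyzed path of '/var/log/secure', while a not_analyzed field only matches the whole string.

```python
import re


def standard_like_tokens(text):
    """Rough approximation of an analyzer (assumption, not the real one):
    lowercase the text and split it on anything that is not a letter or digit."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def phrase_matches_analyzed(field_value, query_text):
    """Emulate match_phrase on an analyzed field: the query tokens must
    appear as one contiguous run in the field's token list."""
    doc = standard_like_tokens(field_value)
    query = standard_like_tokens(query_text)
    if not query:
        return False
    return any(
        doc[i:i + len(query)] == query
        for i in range(len(doc) - len(query) + 1)
    )


def matches_not_analyzed(field_value, query_text):
    """A not_analyzed field is indexed as a single term, so only the
    exact, complete string matches."""
    return field_value == query_text


if __name__ == "__main__":
    path = "/var/log/secure"
    print(standard_like_tokens(path))
    print(phrase_matches_analyzed(path, "var"))
    print(matches_not_analyzed(path, "var"))
```

Under these assumptions the analyzed path is stored as three separate tokens, so 'var' alone is enough for a phrase match, which is consistent with the lower-scoring hit reported in the original question.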
Re: How to get Elasticsearch boolean match working for multiple fields
Hi - thanks again - I was misunderstanding the following:

    "path": {
      "type": "string",
      "norms": { "enabled": false },
      "fields": {
        "raw": { "type": "string", "index": "not_analyzed", "ignore_above": 256 }
      }
    }

This is saying that the path field is analyzed (default analyzer, and no 'index: not_analyzed'), but that the sub-field 'raw' is not analyzed. One solution for me will be to simply use the path.raw field instead of the path field. I'll also try the regexp.

Thanks again for the help!

Dom
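As a sketch of the path.raw approach mentioned in this reply, the Python snippet below builds a bool query that keeps the phrase match on the analyzed message field and requires an exact value on path.raw through a term query. The function name `build_query` is hypothetical, and the choice of a term query is an assumption about what "use the path.raw field" would look like in practice; the thread itself does not show the final query.

```python
import json


def build_query(message_phrase, exact_path):
    """Build a bool/must query: phrase match on the analyzed `message`
    field, exact match on the not_analyzed `path.raw` sub-field."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match_phrase": {"message": message_phrase}},
                    # term queries are not analyzed, so the value must equal
                    # the indexed string exactly
                    {"term": {"path.raw": exact_path}},
                ]
            }
        }
    }


if __name__ == "__main__":
    body = build_query("Failed password for", "/var/log/secure")
    # The printed JSON can be passed as the request body of a _search call.
    print(json.dumps(body, indent=2))
```

One caveat: the raw sub-fields in this mapping set ignore_above to 256, so values longer than 256 characters are not indexed in path.raw or message.raw and cannot be found through them.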
How to get Elasticsearch boolean match working for multiple fields
Hi,

I need some expert guidance on trying to get a bool match working. I'd like the query to only return a successful search result if *both* 'message' matches 'Failed password for', *and* 'path' matches '/var/log/secure'.

This is my query:

    curl -s -XGET 'http://localhost:9200/logstash-2015.05.07/syslog/_search?pretty=true' -d '{
      "filter": { "range": { "@timestamp": { "gte": "now-1h" } } },
      "query": { "bool": { "must": [
        { "match_phrase": { "message": "Failed password for" } },
        { "match_phrase": { "path": "/var/log/secure" } }
      ] } }
    }'

Here is the start of the output from the search:

    {
      "took": 3,
      "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "failed": 0 },
      "hits": {
        "total": 46,
        "max_score": 13.308596,
        "hits": [ {
          "_index": "logstash-2015.05.07",
          "_type": "syslog",
          "_id": "AU0wzLEqqCKq_IPSp_8k",
          "_score": 13.308596,
          "_source": {"message":"May 7 16:53:50 s_local@logstash-02 sshd[17970]: Failed password for fred from 172.28.111.200 port 43487 ssh2","@version":"1","@timestamp":"2015-05-07T16:53:50.554-07:00","type":"syslog","host":"logstash-02","path":"/var/log/secure"}
        }, ...

The problem is that if I change '/var/log/secure' to just 'var', say, and run the query, I still get a result, just with a lower score. I understood the bool...must construct meant both match terms here would need to be successful. What I'm after is *no* result if 'path' doesn't exactly match '/var/log/secure'...

    {
      "took": 3,
      "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "failed": 0 },
      "hits": {
        "total": 46,
        "max_score": 10.354593,
        "hits": [ {
          "_index": "logstash-2015.05.07",
          "_type": "syslog",
          "_id": "AU0wzLEqqCKq_IPSp_8k",
          "_score": 10.354593,
          "_source": {"message":"May 7 16:53:50 s_local@logstash-02 sshd[17970]: Failed password for fred from 172.28.111.200 port 43487 ssh2","@version":"1","@timestamp":"2015-05-07T16:53:50.554-07:00","type":"syslog","host":"logstash-02","path":"/var/log/secure"}
        }, ...
I checked the mappings for these fields to check that they are not analyzed:

    curl -X GET 'http://localhost:9200/logstash-2015.05.07/_mapping?pretty=true'

I think these fields are not analyzed, and so I believe the search will not be analyzed either (based on some training documentation I read recently from Elasticsearch).

Here is a snippet of the _mapping output for this index:

    "message": {
      "type": "string",
      "norms": { "enabled": false },
      "fields": {
        "raw": { "type": "string", "index": "not_analyzed", "ignore_above": 256 }
      }
    },
    "path": {
      "type": "string",
      "norms": { "enabled": false },
      "fields": {
        "raw": { "type": "string", "index": "not_analyzed", "ignore_above": 256 }
      }
    },

Where am I going wrong (in a bunch of places, I'm sure)? What am I misunderstanding here (probably a lot!)?

Any help would be much appreciated!

Thanks

--
Please update your bookmarks! We moved to https://discuss.elastic.co/
---
You received this message because you are subscribed to the Google Groups elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/0470f9df-8d9a-48ef-9dbd-a90c8f2db194%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
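For readers checking a mapping like the one in this post, the Python sketch below walks a `properties` block and reports, per field, whether it is analyzed. It relies on the behaviour of string fields in the Elasticsearch 1.x releases current at the time of this thread, where a missing `index` setting means analyzed. The function name `analyzed_status` and the `mapping` dictionary (a copy of the snippet above) are illustrative only.

```python
def analyzed_status(properties, prefix=""):
    """Return {field_path: index_setting} for every string field in a
    1.x-style `properties` mapping, including multi-fields such as path.raw."""
    result = {}
    for name, spec in properties.items():
        path = prefix + name
        if spec.get("type") == "string":
            # With no explicit `index` setting, a 1.x string field is analyzed.
            result[path] = spec.get("index", "analyzed")
        # Multi-fields are declared under `fields`.
        result.update(analyzed_status(spec.get("fields", {}), path + "."))
        # Object fields nest their children under `properties`.
        result.update(analyzed_status(spec.get("properties", {}), path + "."))
    return result


# Copy of the mapping snippet from the post, for illustration.
raw_subfield = {"type": "string", "index": "not_analyzed", "ignore_above": 256}
mapping = {
    "message": {
        "type": "string",
        "norms": {"enabled": False},
        "fields": {"raw": dict(raw_subfield)},
    },
    "path": {
        "type": "string",
        "norms": {"enabled": False},
        "fields": {"raw": dict(raw_subfield)},
    },
}

if __name__ == "__main__":
    for field, status in sorted(analyzed_status(mapping).items()):
        print(field, status)
```

Read this way, the not_analyzed setting in the snippet belongs to the raw sub-fields, not to message and path themselves.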
Re: How to get Elasticsearch boolean match working for multiple fields
what es version is that?