Hello, I am having a problem indexing and searching for words that may or may not contain whitespace. Below is an example.
Here is how the index is created:

    curl -s -XPUT 'localhost:9200/test/name/1' -d '{ "street": "Lakeshore Dr" }'
    curl -s -XPUT 'localhost:9200/test/name/2' -d '{ "street": "Sunnyshore Dr" }'
    curl -s -XPUT 'localhost:9200/test/name/3' -d '{ "street": "Lake View Dr" }'
    curl -s -XPUT 'localhost:9200/test/name/4' -d '{ "street": "Shore Dr" }'

If I want to query for record 1 ("Lakeshore Dr"), I can use the following query:

    curl -s -XGET 'localhost:9200/test/name/_search?pretty=true' -d '{
      "query": {
        "bool": {
          "must": [
            { "match": { "street": { "query": "lakeshore dr", "type": "phrase" } } }
          ]
        }
      }
    }'

This returns the desired result, document id 1. But if a user searches for "Lake Shore Dr" (with a space between "Lake" and "Shore"), document id 1 should still be returned.

The inverse of this problem occurs when a user searches for "Lakeview Dr" (which is indexed as "Lake View Dr"):

    curl -s -XGET 'localhost:9200/test/name/_search?pretty=true' -d '{
      "query": {
        "bool": {
          "must": [
            { "match": { "street": { "query": "lakeview dr", "type": "phrase" } } }
          ]
        }
      }
    }'

This search matches no documents. If the search is changed to a boolean search instead of a phrase search, many docs will match on "dr", but doc #3, "Lake View Dr", is not necessarily returned as the top match.

NGrams at index time? NGrams at search time? Remove whitespace at index time and/or search time?

Any suggestions would be appreciated.

Thanks.
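
To make the "remove whitespace at index time/search time" idea concrete, here is a rough sketch of the behavior I am after, written in plain Python rather than as an Elasticsearch analyzer. The names `normalize`, `build_index`, and `search` are made up for illustration and are not part of any Elasticsearch API. The only assumption is that the same normalization is applied both to the stored street names and to the incoming query.

```python
# Hypothetical illustration only: plain Python, not Elasticsearch code.
# The same four documents as in the curl commands above.
STREETS = {
    1: "Lakeshore Dr",
    2: "Sunnyshore Dr",
    3: "Lake View Dr",
    4: "Shore Dr",
}


def normalize(text):
    """Lowercase the text and drop all whitespace."""
    return "".join(text.lower().split())


def build_index(streets):
    """Map each normalized street name to the ids of matching documents."""
    index = {}
    for doc_id, street in streets.items():
        index.setdefault(normalize(street), []).append(doc_id)
    return index


def search(index, query):
    """Normalize the query the same way, then look it up exactly."""
    return index.get(normalize(query), [])


INDEX = build_index(STREETS)

print(search(INDEX, "Lake Shore Dr"))  # [1]
print(search(INDEX, "Lakeview Dr"))    # [3]
print(search(INDEX, "Shore Dr"))       # [4]
```

With this normalization, "Lake Shore Dr" and "Lakeshore Dr" collapse to the same key, and "Shore Dr" does not accidentally match "Lakeshore Dr". The sketch only handles whole-field matches, so a partial query such as "Lakeshore" finds nothing, which is part of why I am also wondering about NGrams.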