Hi! This is a sample setup, close to what I am working with https://gist.github.com/anonymous/6e1457321a8ad78c6af8
As you can see, I am trying to remove the hyphens from all words, so that words like "hand-made" are indexed as "handmade". The goal is for a search for "handmade" to find all documents containing "hand-made", and vice versa. For some reason it doesn't work, though :(

I have also attached 3 sample queries. The expected result would be for all of them to return the same result set.

1) Astonishingly, a search for "Chemie-ingenieur" finds 2 results, but a search for "Chemieingenieur" finds none. This is pretty creepy to me, since the char_filter is supposed to strip the hyphens before tokenizing during indexing.
2) Another creepy fact is that if I specify the searchAnalyzer explicitly, I find no results in this document set (see query 3).
3) Moreover, the Analyze API shows that the search term "Chemie-ingenieur" gets translated to "Chemieingenieur" using this analyzer.
4) And the creepiest fact is that when I run these queries against the actual index data (800+ documents), I get 17 results for "Chemie-ingenieur" and 22 for "Chemieingenieur", and NONE OF THEM OVERLAP. That is, I have a total of 39 documents that should match either query. Some of the documents that match "Chemie-ingenieur" don't actually contain the word with the hyphen, so I would expect these documents to be in both result sets, maybe with a different relevance score. This is, however, not the case.

Please help me get over this; I have been struggling with it for a full week already. I would be very grateful for an explanation too, apart from a solution, since the output is very different from what I expect, and this means that I don't really understand the system.

P.S. Please focus on the actual problem and let's not discuss the mapping in detail. The version I have pasted is pretty different from what I started with, due to the trial-and-error approach I have been using for almost a week.
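To make the expectation above concrete, here is a minimal sketch in plain Python. It is not Elasticsearch code and not the configuration from the gist; it is only a toy model of what I assume "strip hyphens before tokenizing, at both index and search time" should mean. The names `strip_hyphens`, `analyze`, `build_index` and `search`, as well as the three sample documents, are hypothetical.

```python
import re


def strip_hyphens(text):
    """Toy stand-in for the char_filter: drop hyphens before tokenizing."""
    return text.replace("-", "")


def analyze(text):
    """Toy analyzer: char filter, then lowercase, then split into word tokens."""
    return re.findall(r"\w+", strip_hyphens(text).lower())


def build_index(docs):
    """Build an inverted index (token -> set of doc ids) using analyze()."""
    index = {}
    for doc_id, text in docs.items():
        for token in analyze(text):
            index.setdefault(token, set()).add(doc_id)
    return index


def search(index, query):
    """Analyze the query with the SAME analyzer and AND the token matches."""
    tokens = analyze(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical sample documents, not taken from the real index.
docs = {
    1: "Chemie-ingenieur gesucht",
    2: "Chemieingenieur in Vollzeit",
    3: "hand-made furniture",
}
index = build_index(docs)

# Both spellings go through the same analysis, so they hit the same token.
print(search(index, "Chemie-ingenieur"))
print(search(index, "Chemieingenieur"))
```

In this model both queries return the same documents, because the hyphen is removed on the indexing side and on the query side alike. If only one side applied the filter, the two spellings would produce different tokens and the result sets could differ, which is the kind of mismatch I seem to be seeing.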
Thanks sincerely,
Georgi