The next thing to do is add debugQuery=true to your URL (or enable it in the query pane of the admin UI). Then look for the parsed query info.
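If you would rather script this than click through the admin UI, here is a minimal sketch. The host, port and core name in SOLR_SELECT are placeholders I made up, and debug_url and parsed_query are hypothetical helper names, not part of any Solr client. I am also assuming the JSON response keeps the debug output under a top-level "debug" key, so check that against your own response. The sample response is trimmed down to the debug fields only.

```python
import json
from urllib.parse import urlencode

# Hypothetical Solr location; replace with your own host and core name.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"


def debug_url(query, base=SOLR_SELECT):
    """Build a select URL that asks for JSON output with debugQuery=true."""
    params = {"q": query, "wt": "json", "debugQuery": "true"}
    return base + "?" + urlencode(params)


def parsed_query(response_text):
    """Return the post-analysis query string from a debug response.

    Assumes the debug block sits under a top-level "debug" key.
    """
    return json.loads(response_text)["debug"]["parsedquery"]


# Trimmed-down sample response containing only the debug fields.
SAMPLE = """
{"debug": {
  "rawquerystring": "text_en:(Jack and Jill's House)",
  "querystring": "text_en:(Jack and Jill's House)",
  "parsedquery": "text_en:jack text_en:jill text_en:hous",
  "parsedquery_toString": "text_en:jack text_en:jill text_en:hous"}}
"""

if __name__ == "__main__":
    # URL you could paste into a browser or fetch with any HTTP client.
    print(debug_url("name:industrie-anhänger"))
    # The terms that are actually searched for after analysis.
    print(parsed_query(SAMPLE))
```

Note that urlencode takes care of percent-encoding the query for you, so there is no need to escape the hyphen by hand.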
On the standard text_en field, which includes an English stop word filter, I ran a query on "Jack and Jill's House", which showed this output:

    "rawquerystring": "text_en:(Jack and Jill's House)",
    "querystring": "text_en:(Jack and Jill's House)",
    "parsedquery": "text_en:jack text_en:jill text_en:hous",
    "parsedquery_toString": "text_en:jack text_en:jill text_en:hous",

You can see that the parsed query is formed *after* analysis, so you can see exactly what is being queried for.

Also, as a corollary to this, you can use the schema browser (or faceting for that matter) to view what terms are being indexed, to see if they should match.

HTH

Upayavira

> On 11.06.2015 12:00, Upayavira wrote:
>> Have you used the analysis tab in the admin UI? You can type in
>> sentences for both index and query time and see how they would be
>> analysed by various fields/field types.
>>
>> Once you have got index time and query time to result in the same
>> tokens at the end of the analysis chain, you should start seeing
>> matches in your queries.
>>
>> Upayavira
>>
>> On Thu, Jun 11, 2015, at 10:26 AM, Thomas Michael Engelke wrote:
>>> Hey,
>>>
>>> in German, you can string most nouns together by using hyphens,
>>> like this:
>>>
>>> Industrie = industry
>>> Anhänger = trailer
>>> Industrie-Anhänger = trailer for industrial use
>>>
>>> Here [1], you can see me querying "Industrieanhänger" from the
>>> "name" field (name:Industrieanhänger), to make sure the index
>>> actually contains the word. Our data is structured so that products
>>> are listed without the hyphen.
>>>
>>> Now, customers can come around and use the hyphenated version as a
>>> search term (e.g. "industrie-anhänger"), and of course we want them
>>> to find what they are looking for. I've set it up so that the
>>> WordDelimiterFilterFactory uses catenateWords="1", so that these
>>> words are catenated. An analysis of "Industrieanhänger" as index
>>> and "industrie-anhänger" as query can be seen here [2]. You can see
>>> that both word parts are found.
>>> However, querying for "industrie-anhänger" does not yield results,
>>> only when the hyphen is removed, as you can see here [3]. I'm not
>>> sure how to proceed from here, as the results of the analysis have
>>> so far always lined up with what I could see when querying. Here's
>>> the schema definition for "text", the field type for the "name"
>>> field:
>>>
>>> <fieldType name="text" class="solr.TextField"
>>>            positionIncrementGap="100"
>>>            autoGeneratePhraseQueries="true">
>>>   <analyzer type="index">
>>>     <tokenizer class="solr.StandardTokenizerFactory"/>
>>>     <filter class="solr.WordDelimiterFilterFactory"
>>>             splitOnCaseChange="1" splitOnNumerics="1"
>>>             generateWordParts="1" generateNumberParts="1"
>>>             catenateWords="1" catenateNumbers="0" catenateAll="0"
>>>             preserveOriginal="1"/>
>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>     <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
>>>             dictionary="dictionary.txt" minWordSize="5"
>>>             minSubwordSize="3" maxSubwordSize="30"
>>>             onlyLongestMatch="false"/>
>>>     <filter class="solr.StopFilterFactory" words="stopwords.txt"
>>>             ignoreCase="true" enablePositionIncrements="true"
>>>             format="snowball"/>
>>>     <filter class="solr.GermanNormalizationFilterFactory"/>
>>>     <filter class="solr.SnowballPorterFilterFactory"
>>>             language="German2" protected="protwords.txt"/>
>>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>>>   </analyzer>
>>>   <analyzer type="query">
>>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>>     <filter class="solr.WordDelimiterFilterFactory"
>>>             splitOnCaseChange="1" splitOnNumerics="1"
>>>             generateWordParts="1" generateNumberParts="1"
>>>             catenateWords="1" catenateNumbers="0" catenateAll="0"
>>>             preserveOriginal="1"/>
>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>     <!-- <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
>>>             dictionary="dictionary.txt" minWordSize="5"
>>>             minSubwordSize="3" maxSubwordSize="30"
>>>             onlyLongestMatch="false"/> -->
>>>     <filter class="solr.StopFilterFactory" words="stopwords.txt"
>>>             ignoreCase="true" enablePositionIncrements="true"
>>>             format="snowball"/>
>>>     <filter class="solr.GermanNormalizationFilterFactory"/>
>>>     <filter class="solr.SnowballPorterFilterFactory"
>>>             language="German2" protected="protwords.txt"/>
>>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>>>   </analyzer>
>>> </fieldType>
>>>
>>> I've also thought it might be a problem with URL encoding not
>>> encoding the hyphen, but replacing it with %2D didn't change the
>>> outcome (and was probably wrong anyway).
>>>
>>> Any help is greatly appreciated.
>>>
>>> Links:
>>> ------
>>> [1] http://imgur.com/2oEC5vz
>>> [2] http://i.imgur.com/H0AhEsF.png
>>> [3] http://imgur.com/dzmMe7t