Hello, I'm using Solr 3.4, and I'm having a problem with a request that returns different results depending on whether or not there is a space after a comma.
The request "name, number rue taine paris" returns results with 4 words out of 5 matching ("name", "number", "rue", "paris").

The request "name,number rue taine paris" (no space between the comma and "number") returns no results unless I set mm=3, and then the matching words are "rue", "taine", "paris".

If I check in the Solr admin analyzer, I get the same analysis for the two requests. But it seems, in fact, that the missing space after the comma prevents "name" and "number" from matching.

My field type is:

<analyzer type="query">
  <!-- standard tokenization -->
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <!-- normalize accents, cedillas, the œ ligature, ... -->
  <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
  <filter class="solr.ASCIIFoldingFilterFactory"/>
  <!-- remove dots (I.B.M. => IBM) -->
  <filter class="solr.StandardFilterFactory"/>
  <!-- lowercase -->
  <filter class="solr.LowerCaseFilterFactory"/>
  <!-- strip leading and trailing punctuation -->
  <filter class="solr.PatternReplaceFilterFactory" pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$" replacement="$2"/>
  <!-- remove empty tokens and oversized words -->
  <filter class="solr.LengthFilterFactory" min="1" max="100" />
  <!-- split compound words -->
  <filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="1" splitOnNumerics="1" stemEnglishPossessive="1" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="1" catenateAll="0" preserveOriginal="1"/>
  <!-- remove elisions (l', qu', ...) -->
  <filter class="solr.ElisionFilterFactory" articles="elisionwords.txt"/>
  <!-- remove stop words -->
  <filter class="solr.StopFilterFactory" ignoreCase="1" words="stopwords.txt" enablePositionIncrements="true"/>
  <!-- stemming (plurals, ...) -->
  <filter class="solr.SnowballPorterFilterFactory" language="French" protected="protwords.txt"/>
  <!-- remove possible duplicates -->
  <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>

Does anyone have a clue?

Thanks,
Elisabeth