Same problem with Solr 4.0.

2011/12/8 elisabeth benoit <elisaelisael...@gmail.com>
> Hello,
>
> I'm using Solr 3.4, and I'm having a problem with a request returning
> different results depending on whether or not there is a space after a comma.
>
> The request "name, number rue taine paris" returns results with 4 words
> out of 5 matching ("name", "number", "rue", "paris").
>
> The request "name,number rue taine paris" (no space between the comma and
> "number") returns no results unless I set mm=3, and then the matching words
> are "rue", "taine", "paris".
>
> If I check in the solr.admin.analyzer, I get the same analysis for the two
> different requests. But it seems, in fact, that the missing space after the
> comma prevents "name" and "number" from matching.
>
> My field type is
>
> <analyzer type="query">
>   <!-- standard tokenization -->
>   <tokenizer class="solr.StandardTokenizerFactory"/>
>   <!-- normalize accents, cedillas, the œ ligature, ... -->
>   <charFilter class="solr.MappingCharFilterFactory"
>     mapping="mapping-ISOLatin1Accent.txt"/>
>   <filter class="solr.ASCIIFoldingFilterFactory"/>
>   <!-- remove dots (I.B.M. => IBM) -->
>   <filter class="solr.StandardFilterFactory"/>
>   <!-- lowercase -->
>   <filter class="solr.LowerCaseFilterFactory"/>
>   <!-- strip punctuation -->
>   <filter class="solr.PatternReplaceFilterFactory"
>     pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$" replacement="$2"/>
>   <!-- remove empty tokens and overly long words -->
>   <filter class="solr.LengthFilterFactory" min="1" max="100" />
>   <!-- split compound words -->
>   <filter class="solr.WordDelimiterFilterFactory"
>     splitOnCaseChange="1" splitOnNumerics="1" stemEnglishPossessive="1"
>     generateWordParts="1"
>     generateNumberParts="1" catenateWords="0" catenateNumbers="1"
>     catenateAll="0" preserveOriginal="1"/>
>   <!-- remove elisions (l', qu', ...) -->
>   <filter class="solr.ElisionFilterFactory"
>     articles="elisionwords.txt"/>
>   <!-- remove stop words -->
>   <filter class="solr.StopFilterFactory" ignoreCase="1"
>     words="stopwords.txt" enablePositionIncrements="true"/>
>   <!-- stemming (plurals, ...) -->
>   <filter class="solr.SnowballPorterFilterFactory" language="French"
>     protected="protwords.txt"/>
>   <!-- remove any duplicates -->
>   <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> </analyzer>
>
> Does anyone have a clue?
>
> Thanks,
> Elisabeth