Hi Antoine,
I'll permit myself to respond in English, because my written French is
slower ;-)
Your problem is well known among Solr users: the query parser splits the
input on whitespace before analysis, so the analyzer never sees the input
'la redoute'; it receives 'la' and 'redoute' as separate tokens. You can of
course enclose your search in quotes, like "la redoute", but it is hard to
force your users to do the same.

I have solved this and related problems for our astrophysics system by
writing a better query parser that searches both for individual tokens and
for phrases, so essentially the parser decides when to join tokens
together. This also takes care of multi-token synonyms, because synonym
recognition is a related issue: it happens in the analysis phase, which
comes after parsing. The code is in LUCENE-5014, and I'll perhaps make it
available as a simple jar that you can drop inside Solr, but that is
impossible to do soon, things are too busy.

But I hope the explanation will help you to search for a solution. You need
to make sure that your analysis chain sees 'la redoute' and then, because
you are using the whitespace tokenizer, you need to define the synonyms
with the space escaped:

laredoute,la\ redoute
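
To illustrate the point about parsing happening before analysis, here is a toy Python sketch. It is not Solr or Lucene code: the SYNONYMS table and the functions analyze, parse_default and parse_joined are hypothetical names invented for this illustration, and the matching logic is only a rough stand-in for what a synonym filter does.

```python
# Toy model (NOT Solr code): why a multi-token synonym never fires when the
# query parser splits on whitespace before the analysis chain runs.

# Hypothetical synonym table: token sequence -> canonical single token.
SYNONYMS = {("la", "redoute"): "laredoute"}


def analyze(tokens):
    """Stand-in for an analysis chain: lowercase the tokens, then greedily
    replace the longest token sequence found in SYNONYMS."""
    tokens = [t.lower() for t in tokens]
    out, i = [], 0
    while i < len(tokens):
        # Try the longest candidate sequence starting at position i first.
        for length in range(len(tokens) - i, 0, -1):
            key = tuple(tokens[i:i + length])
            if key in SYNONYMS:
                out.append(SYNONYMS[key])
                i += length
                break
        else:
            # No synonym starts here: keep the token unchanged.
            out.append(tokens[i])
            i += 1
    return out


def parse_default(query):
    """Each whitespace-separated chunk is analyzed on its own, so the
    synonym matcher only ever sees one token at a time."""
    result = []
    for chunk in query.split():
        result.extend(analyze([chunk]))
    return result


def parse_joined(query):
    """The whole token sequence reaches the analysis chain together, so the
    two-token synonym can match."""
    return analyze(query.split())


if __name__ == "__main__":
    print(parse_default("la redoute"))  # ['la', 'redoute']
    print(parse_joined("la redoute"))   # ['laredoute']
```

The only difference between the two functions is what the analysis step gets to see: one token at a time, or the whole sequence.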

Hth

Roman
On 4 Nov 2013 11:48, "Antoine REBOUL" <antoine.reb...@gmail.com> wrote:

> Hello,
>
> I would like searches in a text-type field to return results even when
> spaces are entered incorrectly
> (for example: "la redoute"="laredoute").
>
> Today my text field is defined as follows:
>
>
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.ISOLatin1AccentFilterFactory"/>
>     <filter class="solr.StopFilterFactory"
>             ignoreCase="true"
>             words="stopwords.txt"
>             enablePositionIncrements="true"/>
>     <filter class="solr.ElisionFilterFactory" articles="elisions.txt"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms2.txt"
>             ignoreCase="true" expand="false"/>
>     <filter class="solr.ASCIIFoldingFilterFactory"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1"
>             generateNumberParts="1"
>             catenateWords="1"
>             catenateNumbers="1"
>             catenateAll="1"
>             splitOnCaseChange="1"
>             splitOnNumerics="1"
>             preserveOriginal="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <filter class="solr.ISOLatin1AccentFilterFactory"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1"
>             generateNumberParts="1"
>             catenateWords="1"
>             catenateNumbers="0"
>             catenateAll="1"
>             splitOnCaseChange="1"
>             preserveOriginal="1"/>
>     <filter class="solr.StopFilterFactory"
>             ignoreCase="true"
>             words="stopwords.txt"
>             enablePositionIncrements="true"/>
>     <filter class="solr.ElisionFilterFactory" articles="elisions.txt"/>
>     <filter class="solr.ASCIIFoldingFilterFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> Thanks in advance for any replies.
> Best regards.
>
> Antoine Reboul
