Hi,
Here's a field type that uses synonyms:
<fieldtype name="SFR" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.SynonymFilterFactory"
            synonyms="french-synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-ISOLatin1Accent.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-ISOLatin1Accent.txt"/>
  </analyzer>
</fieldtype>
Here are the contents of 'french-synonyms.txt' that I used for testing:
PC,parti communiste
PS,parti socialiste
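
To make the effect of these two rules concrete, here is a rough Python sketch of which single-word terms end up indexed when expand="true" and ignoreCase="true" are set. It is only a simplified model and not Solr's actual implementation: it ignores token positions and offsets, and the helper names (parse_rules, indexed_terms) are made up for illustration.

```python
# Simplified model of a synonyms file used with expand="true" and
# ignoreCase="true". This is NOT Solr's code: positions and offsets are
# ignored, and only the set of indexed single-word terms is tracked.

SYNONYM_RULES = """\
PC,parti communiste
PS,parti socialiste
"""


def parse_rules(text):
    """Map each lowercased entry of a rule to every entry of that rule."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        entries = [entry.strip().lower() for entry in line.split(",")]
        for entry in entries:
            table.setdefault(entry, []).extend(entries)
    return table


def indexed_terms(entry, table):
    """Single-word terms produced when `entry` occurs in the indexed text."""
    terms = set()
    for phrase in table.get(entry.lower(), [entry.lower()]):
        terms.update(phrase.split())
    return terms


table = parse_rules(SYNONYM_RULES)
pc_terms = indexed_terms("PC", table)
ps_terms = indexed_terms("PS", table)
shared = pc_terms & ps_terms

print(sorted(pc_terms))
print(sorted(ps_terms))
print(sorted(shared))
```

In this model both rules contribute the term "parti", so a document containing only "PS" still has "parti" among its indexed terms.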
When I query a field for the words "parti communiste", the following terms
are highlighted:
"parti communiste"
"parti socialiste"
"parti"
"PC"
"PS"
"communiste"
Having "parti socialiste" highlighted is a problem.
I expected only "parti communiste", "parti", "communiste" and "PC"
to be highlighted.
Is there a way to get the behavior I expected?
Here is the query I use:
wt=json
&q=qAndMSFR%3A%28parti%20communiste%29
&q.op=AND
&start=0
&rows=5
&fl=id,studyId,questionFR,modalitiesFR,variableLabelFR,variableName,nesstarVariableId,lang,studyTitle,nesstarStudyId,CevipofConcept,studyQuestionCount,questionPosition,preQuestionText,
&sort=score%20desc
&facet=true
&facet.field=CevipofConceptCode
&facet.field=studyDateAndId
&facet.sort=lex
&spellcheck=true
&spellcheck.collate=on
&spellcheck.count=10
&hl=on
&hl.fl=questionSMFR,modalitiesSMFR,variableLabelSMFR
&hl.fragsize=1
&hl.snippets=100
&hl.usePhraseHighlighter=true
&hl.highlightMultiTerm=true
&hl.simple.pre=%3Cb%3E
&hl.simple.post=%3C%2Fb%3E
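
For anyone who wants to reproduce the request, the search and highlighting parameters above can be assembled with a short Python sketch like the one below. It covers only a subset of the parameters (no facets, spellcheck, or field list) and only builds the query string; the Solr host and handler path are deliberately left out because they are not given here.

```python
from urllib.parse import quote, urlencode

# Decoded form of the main search and highlighting parameters from the
# request above. Facet, spellcheck, and fl parameters are omitted for brevity.
params = [
    ("wt", "json"),
    ("q", "qAndMSFR:(parti communiste)"),
    ("q.op", "AND"),
    ("start", "0"),
    ("rows", "5"),
    ("hl", "on"),
    ("hl.fl", "questionSMFR,modalitiesSMFR,variableLabelSMFR"),
    ("hl.fragsize", "1"),
    ("hl.snippets", "100"),
    ("hl.usePhraseHighlighter", "true"),
    ("hl.highlightMultiTerm", "true"),
    ("hl.simple.pre", "<b>"),
    ("hl.simple.post", "</b>"),
]

# quote_via=quote encodes spaces as %20 (rather than '+'), matching the
# encoding used in the original request.
query_string = urlencode(params, quote_via=quote)
print(query_string)
```

Unlike the original request, this encodes the commas in hl.fl as %2C, which decodes to the same value on the server side.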