
I need to index an internationalized field ('name') with Solr.
For this purpose, I want to use dynamic fields so that my Solr documents 
contain e.g. 'name_en', 'name_de', 'name_fr'.

When querying the index, I need to know which language a match was found in. 
For this, I want to use Solr highlighting.
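
To make the intent concrete, here is a minimal client-side sketch of how the matching languages could be read off the highlighting section. It assumes the response is requested as JSON (wt=json) and that the highlighting section maps each document id to its highlighted fields; the function name matched_languages and the sample data are only illustrative.

```python
def matched_languages(highlighting, base="name"):
    """Map each document id to the sorted language codes of its highlighted fields.

    `highlighting` is assumed to be the parsed "highlighting" section of a
    JSON response: {doc_id: {field_name: [snippet, ...]}}.
    """
    prefix = base + "_"
    result = {}
    for doc_id, fields in highlighting.items():
        # Keep only fields like name_en / name_de that actually have snippets.
        result[doc_id] = sorted(
            field[len(prefix):]
            for field, snippets in fields.items()
            if field.startswith(prefix) and snippets
        )
    return result


# Hypothetical sample shaped like the highlighting output for document 25.
sample = {
    "25": {
        "name_it": ["<em>Note</em> Test"],
        "name_en": ["<em>Note</em> Test Translation"],
    }
}
print(matched_languages(sample))  # {'25': ['en', 'it']}
```

A document whose highlighting entry is empty would map to an empty list, which is exactly the case that makes this approach fragile for me.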

My problem is that highlighting seems to behave inconsistently, which breaks 
my use case.
The configuration for e.g. my dynamic field '*_en' is as follows:

<dynamicField name="*_en"  type="text_en"    indexed="true"  stored="true" 

The field type 'text_en' is configured as follows:

<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <!-- in this example, we will only use synonyms at query time
        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
        -->
        <!-- Case insensitive stop word removal. -->
        <filter class="solr.StopFilterFactory"
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPossessiveFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" 
        <!-- Optionally you may want to use this less aggressive stemmer instead of PorterStemFilterFactory:
        <filter class="solr.EnglishMinimalStemFilterFactory"/>
        -->
        <filter class="solr.PorterStemFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory"
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPossessiveFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" 
        <!-- Optionally you may want to use this less aggressive stemmer instead of PorterStemFilterFactory:
        <filter class="solr.EnglishMinimalStemFilterFactory"/>
        -->
        <filter class="solr.PorterStemFilterFactory"/>
      </analyzer>
</fieldType>

My index contains the following document:

<int name="id">25</int>
<str name="name_it">Note Test</str>
<str name="description_it"/>
<str name="name_en">Note Test Translation</str>
<str name="description_en"/>
<long name="_version_">1504065955969368064</long>

The query defType=edismax&q=Translation&hl=on&hl.fl=name_* returns the above 
document but does not highlight anything.
The query defType=edismax&q=name_en:Translation&hl=on&hl.fl=name_* returns the 
above document AND highlights 'Translation' as expected.
Since 'Translation' does not occur in any other field, I do not understand how 
the match could have occurred on a field other than 'name_en' (which would 
explain why 'name_en' is not highlighted).
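
For anyone who wants to reproduce the comparison, the two requests can be built as follows. This is only a sketch: the parameter values are the ones from the queries above, while the helper name highlight_query is hypothetical and no host or core name is assumed.

```python
from urllib.parse import parse_qs, urlencode


def highlight_query(q, hl_fields="name_*"):
    """Build the URL-encoded query string for an edismax query with highlighting."""
    params = {"defType": "edismax", "q": q, "hl": "on", "hl.fl": hl_fields}
    return urlencode(params)


# Unfielded query: returns the document but highlights nothing in my setup.
unfielded = highlight_query("Translation")
# Fielded query: returns the document and highlights 'Translation'.
fielded = highlight_query("name_en:Translation")

# Decoding the strings again shows the only difference is the q parameter.
print(parse_qs(unfielded)["q"], parse_qs(fielded)["q"])
```

The resulting strings can be appended to the select handler URL of whatever core is being tested.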
I already tried:

Neither worked.

Moreover, when I run defType=edismax&q=Note&hl=on&hl.fl=name_* the result is:
<int name="id">25</int>
<str name="name_it">Note Test</str>
<str name="description_it"/>
<str name="name_en">Note Test Translation</str>
<str name="description_en"/>
<long name="_version_">1504067222466723840</long>
<int name="id">27</int>
<str name="name_de">Note Test child</str>
<str name="description_de"/>
<long name="_version_">1504067222528589824</long>

However, the highlighting only contains fields of document 25 but not 27:

<lst name="highlighting">
  <lst name="25">
    <arr name="name_it">
      <str>&lt;em&gt;Note&lt;/em&gt; Test</str>
    </arr>
    <arr name="name_en">
      <str>&lt;em&gt;Note&lt;/em&gt; Test Translation</str>
    </arr>
  </lst>
</lst>
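
A small check like the following can flag returned documents that have no highlighted field at all, which is the situation with document 27. It is a sketch that assumes a JSON response (wt=json) whose highlighting section is keyed by document id as a string; the function name docs_without_highlights is hypothetical.

```python
def docs_without_highlights(doc_ids, highlighting):
    """Return the ids of returned documents that have no highlight snippets.

    A document counts as unhighlighted if it is missing from the
    highlighting section or if all of its snippet lists are empty.
    """
    missing = []
    for doc_id in doc_ids:
        fields = highlighting.get(str(doc_id), {})
        if not any(fields.values()):
            missing.append(doc_id)
    return missing


# Sample data shaped like the result described here: both documents are
# returned, but only document 25 has highlighted fields.
highlighting = {
    "25": {
        "name_it": ["<em>Note</em> Test"],
        "name_en": ["<em>Note</em> Test Translation"],
    }
}
print(docs_without_highlights([25, 27], highlighting))  # [27]
```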

I really do not understand what is happening here and what I can do to make the 
highlighting consistent.
Also, is my approach of using 'name_en', 'name_de', ... fields for localized 
indexing reasonable, or is there a preferable way?

Thank you for your help and best regards

Moritz Becker

curecomp Software Services GmbH
Hafenstrasse 47-51
4020 Linz

web: www.curecomp.com
e-Mail: m.bec...@curecomp.com
