What leads you to believe that the user is not interested in occurrences of
the French phrase in English text? I mean, we English-speakers and writers
like to use French phrases to show how sophisticated we are! It's part of
our... raison d'être. If I do a Google search for "raison d'être", it
doesn't mysteriously show me only French documents.
So, usually, it needs to be a user preference - the user's preferred
language, and whether they want to search across documents in all languages
or just a subset of languages. And then, on the results page you can show
the language and a button to restrict a re-query to the specific language.
If you really need to do this query language detection, the best approach is
to do it within your application layer (you can use the Google code for
language detection) and then send the query to the appropriate query request
handler, with a separate query request handler for each language that
optimizes the settings for that language, such as the language-specific
fields to use for the "qf" parameter.
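As a rough sketch of that last suggestion, the per-language request handlers in solrconfig.xml might look something like the following. The field names come from the schema in your message; the handler names (/select_fr, /select_en, /select_ar) are just placeholders I made up, and I'm assuming each text_* field type carries its own language-specific analysis. The application would detect the query language and send the request to the matching handler.

```xml
<!-- Hypothetical handler names; field names are taken from the schema in the original message. -->
<requestHandler name="/select_fr" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <!-- Query only the French-analyzed field, so only the French analysis chain is applied. -->
    <str name="qf">AllChamp_fr</str>
  </lst>
</requestHandler>

<requestHandler name="/select_en" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">AllChamp_en</str>
  </lst>
</requestHandler>

<requestHandler name="/select_ar" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">AllChamp_ar</str>
  </lst>
</requestHandler>
```

With something like this in place, a query detected as French would go to /select_fr and never touch the English field, so "nous" would be handled only by the French analysis.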
-- Jack Krupansky
-----Original Message-----
From: benjelloun
Sent: Friday, July 4, 2014 10:52 AM
To: solr-user@lucene.apache.org
Subject: multilingual search
Hello,
What I need to do is detect the language of my fields at index time. Then,
when I search with the "/select" request handler, how can I make the search
detect the language of the query words and choose which field_langid to use?
My configuration:
<updateRequestProcessorChain name="langid">
  <processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
    <lst name="defaults">
      <bool name="langid">true</bool>
      <str name="langid.fl">NomDocument,ContenuDocument,Postit,</str>
      <str name="langid.langField">language_s</str>
      <str name="langid.whitelist">en,fr,ar</str>
      <str name="langid.fallback">fr</str>
      <float name="langid.threshold">0.6</float>
      <bool name="langid.map">true</bool>
      <bool name="langid.map.individual">true</bool>
      <bool name="langid.map.keepOrig">true</bool>
    </lst>
  </processor>
<field name="AllChamp_ar" type="text_ar" multiValued="true" indexed="true" required="false" stored="false"/>
<field name="AllChamp_fr" type="text_fr" multiValued="true" indexed="true" required="false" stored="false"/>
<field name="AllChamp_en" type="text_en" multiValued="true" indexed="true" required="false" stored="false"/>
<dynamicField name="*_en" type="text_en" indexed="true" stored="false" required="false" multiValued="true"/>
<dynamicField name="*_fr" type="text_fr" indexed="true" stored="false" required="false" multiValued="true"/>
<dynamicField name="*_ar" type="text_ar" indexed="true" stored="false" required="false" multiValued="true"/>
<copyField source="*_ar" dest="AllChamp_ar"/>
<copyField source="*_fr" dest="AllChamp_fr"/>
<copyField source="*_en" dest="AllChamp_en"/>
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="defType">edismax</str>
    <str name="qf">
      AllChamp^2.0 AllChamp_ar^2.0 AllChamp_en^2.0 AllChamp_fr^5.0
    </str>
  </lst>
</requestHandler>
Example of a search in the Solr Admin UI: "nous présentons", which is French,
and "nous" is in stopwords_fr.
But when I search for "nous présentons", I get matches on "nous" because I
have some English documents that contain "nous".
This is just one example for one language. I don't want to add the
stopwords_fr entries to stopwords_en.
What I want is to detect the language before the select search and then
choose the matching field_langid for the search.
Best regards,
Anass BENJELLOUN