See below:

But also search the archives for "multilanguage"; this topic has been discussed
many times before. Lucid Imagination maintains a Solr-powered (of course)
searchable list at: http://www.lucidimagination.com/search/

On Wed, Oct 20, 2010 at 9:03 AM, Jakub Godawa <jakub.god...@gmail.com> wrote:

> Hi everyone! (my first post)
>
> I am new, but really curious about the usefulness of Lucene/Solr for document
> search in web applications. I use Ruby on Rails to create one, with the
> plugin "acts_as_solr_reloaded" that makes the connection between the web app
> and Solr easy.
>
> So I am at a point where I know that a good solution is to prepare
> multi-language documents with fields like:
> question_en, answer_en,
> question_fr, answer_fr,
> question_pl,  answer_pl... etc.
>
> I need to create an index that would work with 6 languages: English,
> French, German, Russian, Ukrainian and Polish.
>
> My questions are:
> 1. Is it doable to have just one search field that behaves like Google's
> for all those documents? It can be an option to indicate a language to search.
>

This depends on what you mean by do-able. Are you going to allow a French
user to search an English document (and so on)? But the real answer is "yes,
you can, if you .....". There'll be tradeoffs.

Take a look at the dismax handler. It's kind of hard to grok all at once, but
you can cause it to search across multiple fields. That is, the user types
"language", and you can turn it into a complex query under the covers like
lang_en:language lang_fr:language lang_ru:language, etc. You can also apply
boosts. Note that this has obvious problems with, say, Russian. Half your job
will be figuring out what will satisfy the user.....
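
As a rough sketch of the multi-field idea (not taken from any real config), a
dismax handler in solrconfig.xml could list the per-language fields in qf. The
handler name "multilang" and the boost values are made up for illustration, and
the field names follow the question_xx/answer_xx scheme from the original post
rather than the lang_xx names used above:

```xml
<!-- solrconfig.xml: hypothetical handler that searches several languages -->
<requestHandler name="multilang" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- use the dismax query parser for the user's q parameter -->
    <str name="defType">dismax</str>
    <!-- qf: the fields to search; ^N is an optional per-field boost -->
    <str name="qf">
      question_en^2.0 answer_en
      question_fr^2.0 answer_fr
      question_ru^2.0 answer_ru
    </str>
  </lst>
</requestHandler>
```

With something like this in place, a request carrying qt=multilang&q=language
should be run against every field listed in qf, each one analyzed according to
its own field type.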

You could also have a *different* dismax handler defined for various
languages. Say the user was coming from Spanish: consider a browseES handler.
See solrconfig.xml for the default dismax handler. The Solr book mentioned
below describes this.
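
To make the per-language variant concrete, here is a minimal sketch of one such
handler. The name "browseFR" is hypothetical (chosen by analogy with browseES,
since French is one of the six languages in the original post), and the fields
and boosts are assumptions, not recommendations:

```xml
<!-- solrconfig.xml: hypothetical handler for users searching in French -->
<requestHandler name="browseFR" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <!-- favor the French fields; keep English as a weakly boosted fallback -->
    <str name="qf">question_fr^2.0 answer_fr question_en^0.5 answer_en</str>
  </lst>
</requestHandler>
```

The application would then pick the handler (for example via the qt parameter)
based on the language the user chose, and you would define one such handler per
language. Whether to include a fallback language at all is one of the
tradeoffs to settle.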


> 2. How should I begin changing the solr/conf/schema.xml (or other) file to
> tailor it to my needs? As I am a real rookie here, I am still a bit confused
> about "fields", "fieldTypes" and their connection with a particular field
> (e.g. answer_fr) and the "tokenizers" and "analyzers". If someone can provide
> a basic step-by-step tutorial on how to make it work in two languages I would
> be more than happy.
>

You have several choices here:
- The books "Lucene in Action" and "Solr 1.4 Enterprise Search Server" both
  have discussions of this.
- Spend some time on the solr/admin/analysis page. That page allows you to see
  pretty much exactly what each of the steps in an analyzer chain accomplishes.
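
To give a feel for how the pieces connect, here is a minimal two-language
sketch for schema.xml. A fieldType names an analyzer chain (one tokenizer plus
zero or more filters), and a field points at a fieldType through its type
attribute. The type names text_en and text_fr are made up for this example, and
the particular filters are only one plausible choice:

```xml
<!-- schema.xml, inside <types>: one analyzer chain per language -->
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- the tokenizer splits the text into tokens -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- filters then transform the tokens, in the order listed -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
  </analyzer>
</fieldType>
<fieldType name="text_fr" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="French"/>
  </analyzer>
</fieldType>

<!-- schema.xml, inside <fields>: each field picks its chain via "type" -->
<field name="question_en" type="text_en" indexed="true" stored="true"/>
<field name="answer_en"   type="text_en" indexed="true" stored="true"/>
<field name="question_fr" type="text_fr" indexed="true" stored="true"/>
<field name="answer_fr"   type="text_fr" indexed="true" stored="true"/>
```

The language attribute has to name a stemmer that the Snowball filter actually
provides, so check that separately for each of your six languages. After
reindexing, the analysis page will show what each step does to text in
question_en versus question_fr.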


> 3. Are all those languages supported (officially/unofficially) by
> lucene/solr?
>

See:
http://lucene.apache.org/java/3_0_2/api/all/org/apache/lucene/analysis/Analyzer.html
Remember that Solr is built on Lucene, so these analyzers are available.


>
> Thank you for help,
> Jakub Godawa.
>

Best
Erick
