Tansley, Robert wrote:
What if we're trying to index multiple languages in the same site? Is
it best to have:
1/ one index for all languages
2/ one index for all languages, with an extra language field so searches
can be constrained to a particular language
3/ separate indices for each language?
I'd use 2/. In particular, use the same fields for the content, title,
etc., even when their terms are produced by different analyzers. Also
have a "lang" field that names the language of each document.
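
To make the layout concrete, here is a minimal sketch of option 2/ as a toy in-memory index in plain Java. It is not the Lucene API: the class MultiLangIndex, the ANALYZERS map, and the tiny stop-word lists are all hypothetical stand-ins, assumed only for illustration. The point it shows is that every language writes into one shared "content" posting map, while the language is recorded separately in "lang".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

class MultiLangIndex {
    // Hypothetical per-language "analyzers": each turns text into index terms.
    // Real analyzers would also stem, fold accents, and so on.
    static final Map<String, Function<String, List<String>>> ANALYZERS = Map.of(
        "en", text -> tokenize(text, Set.of("the", "a", "of")),
        "fr", text -> tokenize(text, Set.of("le", "la", "de"))
    );

    // Lowercase, split on non-alphanumeric characters, drop stop words.
    static List<String> tokenize(String text, Set<String> stopWords) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty() && !stopWords.contains(t)) {
                terms.add(t);
            }
        }
        return terms;
    }

    // One shared "content" field for all languages: term -> ids of matching docs.
    final Map<String, Set<Integer>> contentPostings = new HashMap<>();
    // The separate "lang" field: doc id -> language code.
    final Map<Integer, String> langField = new HashMap<>();

    // Analyze with the document's own analyzer, but index into the shared field.
    void add(int id, String lang, String content) {
        Function<String, List<String>> analyzer = ANALYZERS.get(lang);
        if (analyzer == null) {
            throw new IllegalArgumentException("no analyzer for language: " + lang);
        }
        for (String term : analyzer.apply(content)) {
            contentPostings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
        }
        langField.put(id, lang);
    }

    public static void main(String[] args) {
        MultiLangIndex index = new MultiLangIndex();
        index.add(1, "en", "The chat log of the meeting");
        index.add(2, "fr", "Le chat de la voisine");
        // "chat" is a term in both documents, although it means different things.
        System.out.println("chat -> " + index.contentPostings.get("chat"));
        System.out.println("lang of doc 2 -> " + index.langField.get(2));
    }
}
```

Because both documents land in the same field, a single query for "chat" finds either one, which is exactly the cross-language false positive that a "lang" restriction can remove.
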
At query time, use an analyzer selected by the user's environment (e.g.,
the HTTP Accept-Language header). If folks are getting false positives,
where a term in another language that means something different is
matching their query, they can use a "lang" pulldown to exclude
documents in other languages. Implement that restriction as a Lucene
Filter on the "lang" field.
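
The query-time side can be sketched the same way. Again this is a toy model in plain Java rather than Lucene code: LangFilteredSearch, Entry, pickLanguage, and search are hypothetical names, and the header parsing deliberately ignores q-values. It shows a language being chosen from an Accept-Language value (which would in turn select the query analyzer) and an optional "lang" restriction applied on top of a search over the shared field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class LangFilteredSearch {
    // Toy index entry: doc id, "lang" field value, and analyzed "content" terms.
    static final class Entry {
        final int id;
        final String lang;
        final Set<String> terms;

        Entry(int id, String lang, String... terms) {
            this.id = id;
            this.lang = lang;
            this.terms = new HashSet<>(Arrays.asList(terms));
        }
    }

    // Return the part of s before the first occurrence of c (or all of s).
    static String before(String s, char c) {
        int i = s.indexOf(c);
        return i >= 0 ? s.substring(0, i) : s;
    }

    // Pick the first listed language we support, e.g. "fr-CA,fr;q=0.9" -> "fr".
    // Simplification: q-values are ignored and header order is trusted.
    static String pickLanguage(String acceptLanguage, Set<String> supported, String fallback) {
        if (acceptLanguage != null) {
            for (String part : acceptLanguage.split(",")) {
                String tag = before(part, ';').trim().toLowerCase(Locale.ROOT);
                String primary = before(tag, '-');
                if (supported.contains(primary)) {
                    return primary;
                }
            }
        }
        return fallback;
    }

    // Search the shared content field. A null langFilter means "all languages";
    // otherwise only documents whose "lang" field equals it are kept.
    static List<Integer> search(List<Entry> index, String term, String langFilter) {
        List<Integer> hits = new ArrayList<>();
        for (Entry e : index) {
            boolean matches = e.terms.contains(term);
            boolean allowed = langFilter == null || langFilter.equals(e.lang);
            if (matches && allowed) {
                hits.add(e.id);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Entry> index = Arrays.asList(
            new Entry(1, "en", "chat", "log", "meeting"),
            new Entry(2, "fr", "chat", "voisine"));

        String lang = pickLanguage("fr-CA,fr;q=0.9,en;q=0.8", Set.of("en", "fr"), "en");
        System.out.println("query analyzer language: " + lang);
        System.out.println("unfiltered hits: " + search(index, "chat", null));
        System.out.println("hits with lang pulldown: " + search(index, "chat", lang));
    }
}
```

Keeping the restriction separate from the query means the default search still spans all languages, and the pulldown only narrows results when the user asks for it.
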
Doug