Robert,

On 2 June 05, at 21:42, Tansley, Robert wrote:
> It seems that there are even more options --
>
> 4/ One index, with a separate Lucene document for each (item, language) combination, with one field that specifies the language
> 5/ One index, one Lucene document per item, with field names that include the language (e.g. title_en, title_cn)
>
> I quite like 4, because you can search with no language constraint, or with one as Paul suggests below.

You can in both cases. In the second, you need to expand the query (i.e. searching for carrot would search "text_en:carrot OR text_cn:carrot"), which, I think, is fair as long as you don't have a two-kilometre-long list of languages.
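
To make the expansion concrete, here is a rough sketch in Python that only builds a query string in Lucene's classic query-parser syntax. The helper name expand_query and the text_<lang> field naming are assumptions for illustration, and the term is assumed to be a single word with no special characters that would need escaping.

```python
def expand_query(term, languages, field_prefix="text"):
    """Build one clause per language-specific field, joined with OR.

    Hypothetical helper: field names follow the assumed pattern
    <field_prefix>_<language code>, e.g. text_en, text_cn.
    """
    if not languages:
        raise ValueError("at least one language is required")
    clauses = [f"{field_prefix}_{lang}:{term}" for lang in languages]
    return " OR ".join(clauses)


# Expanding "carrot" over two languages yields two OR-ed clauses.
print(expand_query("carrot", ["en", "cn"]))
```

The query grows by one clause per language, which is why a very long list of languages would make this approach unwieldy.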

> However, some "non language-specific" data might need to be repeated (e.g. dates), unless we had an extra Lucene document for all that. I wonder what the various pros and cons in terms of index size and performance would be in each case? I really don't have enough knowledge of Lucene to have any idea...

If you separate the indices you won't, as far as I know, be able to combine constraints in a single query (e.g. text that matches and is also recent enough).
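
By contrast, with everything in one index (option 4, say) the text, language and date constraints can sit in the same query. A rough sketch in Python, again only building a classic query-parser string: the field names text, language and date are assumptions, as is storing dates as yyyyMMdd strings so that a range query sorts correctly.

```python
def build_query(term, language=None, date_from=None, date_to=None):
    """Combine a text clause with optional language and date-range clauses.

    Hypothetical helper; field names and the yyyyMMdd date format
    are assumptions for illustration.
    """
    clauses = [f"text:{term}"]
    if language is not None:
        clauses.append(f"language:{language}")
    if date_from is not None and date_to is not None:
        # Inclusive range in Lucene's classic query syntax.
        clauses.append(f"date:[{date_from} TO {date_to}]")
    return " AND ".join(clauses)


# Text match, restricted to English, and recent enough.
print(build_query("carrot", language="en",
                  date_from="20050101", date_to="20050602"))
# No language constraint at all.
print(build_query("carrot"))
```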

paul

