Re: question about multiple languages

2012-10-09 Thread Erlend Garåsen
On 08.10.12 17.03, Maciej Liżewski wrote: Now there are two possibilities: 1. When fields are untouched, data processing (stemming, etc.) is the same for every document, which is rather wrong because Polish stemming is different from English stemming... :) 2. Attributes are mapped to *_lang and every

Re: question about multiple languages

2012-10-09 Thread Maciej Liżewski
Google guesses the query language. If you hit www.google.com, you will be redirected to www.google.pl if you're sitting in Poland. This may also be achieved in your application by detecting the browser's locale etc. Many web application frameworks have support for this. Then you

Re: question about multiple languages

2012-10-09 Thread Maciej Liżewski
Thanks Erlend for your hints! 2012/10/9 Erlend Garåsen e.f.gara...@usit.uio.no: On 09.10.12 14.19, Maciej Liżewski wrote: Google guesses the query language. If you hit www.google.com, you will be redirected to www.google.pl if you're sitting in Poland. This may also be

question about multiple languages

2012-10-08 Thread Maciej Liżewski
Hi, I would like to know what the default approach is for handling multiple languages in documents. I know that there is a component for the update/extract process that can automagically guess the language, put the language name in an attribute, and map field names to *_[lang] (I know that this is not

Re: question about multiple languages

2012-10-08 Thread Karl Wright
Hi Maciej, Did you intend to send this to the Solr/Lucene dev list? This really isn't a ManifoldCF question. I can help a little perhaps. You are correct that stemming and normalization rules might well differ from language to language, but it is worth noting that for at least normalization it