Tom, could you share the method you use to perform language detection? Are there any open source tools that do that?
Thanks.

--- On Mon, 3/28/11, Tom Mortimer <t...@flax.co.uk> wrote:

> From: Tom Mortimer <t...@flax.co.uk>
> Subject: copyField at search time / multi-language support
> To: solr-user@lucene.apache.org
> Date: Monday, March 28, 2011, 4:45 AM
>
> Hi,
>
> Here's my problem: I'm indexing a corpus with text in a variety of
> languages. I'm planning to detect these at index time and send the
> text to one of a suitably-configured field (e.g. "mytext_de" for
> German, "mytext_cjk" for Chinese/Japanese/Korean etc.)
>
> At search time I want to search all of these fields. However, there
> will be at least 12 of them, which could lead to a very long query
> string. (Also I need to use the standard query parser rather than
> dismax, for full query syntax.)
>
> Therefore I was wondering if there was a way to copy fields at search
> time, so I can have my mytext query in a single field and have it
> copied to mytext_de, mytext_cjk etc. Something like:
>
> <copyQueryField source="mytext" dest="mytext_de" />
> <copyQueryField source="mytext" dest="mytext_cjk" />
> ...
>
> If this is not currently possible, could someone give me some pointers
> for hacking Solr to support it? Should I subclass solr.SearchHandler?
> I know nothing about Solr internals at the moment...
>
> thanks,
> Tom
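
On searching all the language fields at once: one option that needs no changes to Solr is to expand the query on the client before sending it. Below is a minimal sketch in Python, assuming the standard query parser's field grouping syntax, field:(...). The helper name expand_query is hypothetical, and only mytext_de and mytext_cjk come from the message above, so the remaining fields would have to be added to the list.

```python
def expand_query(user_query, fields):
    """Wrap the user's query in a field group for each field and OR the groups.

    Hypothetical client-side helper, not part of Solr. It assumes the
    standard query parser accepts field grouping, i.e. field:(...).
    """
    if not fields:
        raise ValueError("at least one field is required")
    return " OR ".join(f"{field}:({user_query})" for field in fields)


# Only these two field names appear in the original message; the full
# list (12 or more fields) is an assumption left to the reader.
LANGUAGE_FIELDS = ["mytext_de", "mytext_cjk"]

if __name__ == "__main__":
    print(expand_query("foo AND bar", LANGUAGE_FIELDS))
```

The query string still grows with the number of fields, but the application only ever handles the short form. Terms inside the user's query that carry their own field prefix are passed through unchanged, so they would not be redirected to the language fields.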