I haven't tried this as an UpdateProcessor, but it relies on Tika, and Tika's
LanguageIdentifier works well except on short texts.
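
Because detection is unreliable on short texts, an indexing client probably
wants a fallback field for them. Below is a minimal Java sketch of that routing,
under stated assumptions: the detector is passed in as a plain function (it
could wrap Tika's LanguageIdentifier, but that wiring is not shown), and the
class name LanguageFieldRouter, the 50-character threshold and the
"mytext_general" fallback field are all hypothetical, not part of SOLR-1979
or of Solr itself.

```java
import java.util.Map;
import java.util.function.Function;

/** Hypothetical helper: picks a per-language field name for a piece of text. */
final class LanguageFieldRouter {
    // Assumed threshold below which detection is treated as unreliable.
    static final int MIN_CHARS = 50;
    // Assumed catch-all field for short or unrecognised text.
    static final String FALLBACK_FIELD = "mytext_general";

    // Language codes mapped to the field suffixes mentioned in this thread.
    private static final Map<String, String> SUFFIX = Map.of(
            "de", "de",
            "zh", "cjk",
            "ja", "cjk",
            "ko", "cjk");

    /**
     * Returns the field the text should be indexed into.
     * The detector is any function from text to a language code such as "de".
     */
    static String fieldFor(String text, Function<String, String> detector) {
        if (text == null || text.trim().length() < MIN_CHARS) {
            return FALLBACK_FIELD; // too short to trust the detector
        }
        String lang = detector.apply(text);
        String suffix = (lang == null) ? null : SUFFIX.get(lang);
        return (suffix == null) ? FALLBACK_FIELD : "mytext_" + suffix;
    }
}
```

With a detector that returns "ja", a long enough text is routed to
mytext_cjk; anything under the threshold goes to the fallback field
regardless of what the detector says.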

> Thanks Markus.
> 
> Do you know if this patch is good enough for production use? Thanks.
> 
> Andy
> 
> --- On Tue, 3/29/11, Markus Jelsma <markus.jel...@openindex.io> wrote:
> > From: Markus Jelsma <markus.jel...@openindex.io>
> > Subject: Re: copyField at search time / multi-language support
> > To: solr-user@lucene.apache.org
> > Cc: "Andy" <angelf...@yahoo.com>
> > Date: Tuesday, March 29, 2011, 1:29 AM
> > https://issues.apache.org/jira/browse/SOLR-1979
> > 
> > > Tom,
> > > 
> > > Could you share the method you use to perform language detection? Any
> > > open source tools that do that?
> > > 
> > > Thanks.
> > > 
> > > --- On Mon, 3/28/11, Tom Mortimer <t...@flax.co.uk> wrote:
> > > > From: Tom Mortimer <t...@flax.co.uk>
> > > > Subject: copyField at search time / multi-language support
> > > > To: solr-user@lucene.apache.org
> > > > Date: Monday, March 28, 2011, 4:45 AM
> > > > Hi,
> > > > 
> > > > Here's my problem: I'm indexing a corpus with text in a variety of
> > > > languages. I'm planning to detect these at index time and send the
> > > > text to one of several suitably-configured fields (e.g. "mytext_de"
> > > > for German, "mytext_cjk" for Chinese/Japanese/Korean etc.)
> > > > 
> > > > At search time I want to search all of these fields. However, there
> > > > will be at least 12 of them, which could lead to a very long query
> > > > string. (Also I need to use the standard query parser rather than
> > > > dismax, for full query syntax.)
> > > > 
> > > > Therefore I was wondering if there was a way to copy fields at search
> > > > time, so I can have my mytext query in a single field and have it
> > > > copied to mytext_de, mytext_cjk etc. Something like:
> > > > 
> > > >    <copyQueryField source="mytext" dest="mytext_de" />
> > > >    <copyQueryField source="mytext" dest="mytext_cjk" />
> > > >   ...
> > > >
> > > > If this is not currently possible, could someone give me some pointers
> > > > for hacking Solr to support it? Should I subclass solr.SearchHandler?
> > > > I know nothing about Solr internals at the moment...
> > > > 
> > > > thanks,
> > > > Tom
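
On the original question of searching every per-language field with the
standard query parser: nothing in this thread confirms that a search-time
copyField exists, so one possible workaround is to build the multi-field
query on the client. The Java sketch below is only an illustration. The class
name MultiFieldQuery is hypothetical, and it assumes the user query can be
wrapped in a field-qualified group, field:(...), and the groups joined with
OR. It does not make the query Solr receives any shorter; it only keeps the
expansion out of the calling code.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper: repeats one user query across several fields. */
final class MultiFieldQuery {
    /**
     * Wraps the query in a group per field and ORs the groups together,
     * e.g. "mytext_de:(q) OR mytext_cjk:(q)".
     * The query is inserted as-is; it is not escaped or validated here.
     */
    static String expand(String userQuery, List<String> fields) {
        return fields.stream()
                .map(field -> field + ":(" + userQuery + ")")
                .collect(Collectors.joining(" OR "));
    }
}
```

Queries that already carry their own field prefixes would need more care
than this simple wrapping gives them.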
