Re: Pluggable language detection

2012-04-08 Thread Jan Høydahl
In Solr, we added support for pluggable language detectors, one of them being Tika's. See http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/langid/ The detectLanguage() method returns a list of DetectedLanguage objects, each with a normalized certainty between 0.0 and 1.0. Think it's a step in right
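
To make the shape of such an API concrete, here is a minimal sketch of a pluggable detector contract. It is not the Solr or Tika code: the LanguageDetector interface, the StopWordDetector toy implementation, and the accessor names getLangCode() and getCertainty() are assumptions made up for illustration. Only the detectLanguage() method name, the DetectedLanguage type name, and the 0.0 to 1.0 certainty range come from the message above; the real classes are at the URL given there and may well differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical value type: a language code plus a certainty normalized to [0.0, 1.0]. */
final class DetectedLanguage {
    private final String langCode;
    private final double certainty;

    DetectedLanguage(String langCode, double certainty) {
        if (certainty < 0.0 || certainty > 1.0) {
            throw new IllegalArgumentException("certainty must be in [0.0, 1.0]");
        }
        this.langCode = langCode;
        this.certainty = certainty;
    }

    String getLangCode() { return langCode; }
    double getCertainty() { return certainty; }
}

/** Hypothetical plug-in point: any detector (Tika-based or otherwise) implements this. */
interface LanguageDetector {
    /** Returns candidates sorted by descending certainty; empty if nothing was detected. */
    List<DetectedLanguage> detectLanguage(String content);
}

/** Toy detector for illustration only: it counts a handful of stop words per language. */
final class StopWordDetector implements LanguageDetector {
    private static final Set<String> EN = new HashSet<>(Arrays.asList("the", "and", "is"));
    private static final Set<String> NO = new HashSet<>(Arrays.asList("og", "er", "det"));

    @Override
    public List<DetectedLanguage> detectLanguage(String content) {
        int en = 0;
        int no = 0;
        for (String token : content.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (EN.contains(token)) en++;
            if (NO.contains(token)) no++;
        }
        int total = en + no;
        if (total == 0) {
            return Collections.emptyList();
        }
        // Normalize the raw hit counts so the certainties sum to 1.0.
        List<DetectedLanguage> result = new ArrayList<>();
        if (en > 0) result.add(new DetectedLanguage("en", en / (double) total));
        if (no > 0) result.add(new DetectedLanguage("no", no / (double) total));
        result.sort((a, b) -> Double.compare(b.getCertainty(), a.getCertainty()));
        return result;
    }
}

/** Small driver showing that callers depend only on the interface. */
final class LangIdSketch {
    public static void main(String[] args) {
        LanguageDetector detector = new StopWordDetector();
        for (DetectedLanguage d : detector.detectLanguage("the cat and the dog")) {
            System.out.println(d.getLangCode() + " " + d.getCertainty());
        }
    }
}
```

Because callers only see the interface, swapping in a different detector means changing the one line that constructs it. Returning a ranked list rather than a single guess leaves the threshold decision to the caller.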

Re: Pluggable language detection

2012-04-08 Thread Mattmann, Chris A (388J)
Hi Jan, It probably makes sense to provide pluggable language detection in Tika, since it's the lower-level library, so I am +1 for figuring out a solution to implement it in Tika ville. If no one has started on this in the next few weeks, I'll give it a go. Cheers, Chris