Hey Juls, I'd be super +1 on making it pluggable, and I'm willing to help.

Cheers,
Chris

On Mar 21, 2012, at 4:51 PM, Julien Nioche wrote:

> Hi guys,
>
> Just wondering about the best way to make the language detection pluggable
> instead of having it hard-wired as it is now. We know that the resources
> that are currently in Tika are both slow and inaccurate [1] and there are
> other libraries that we could leverage. Why not have the option to select
> a different implementation just like we do for parsers? Obviously we'd need
> a common interface for the parsers etc...
>
> What do you think?
>
> Julien
>
> [1]
> http://blog.mikemccandless.com/2011/10/accuracy-and-performance-of-googles.html
>
> --
> Open Source Solutions for Text Engineering
>
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com
> http://twitter.com/digitalpebble

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++