Cool. Well, with this one I found, plus language-detector, plus Ramirez and the work with Joe Campbell’s group at MIT-LL and the Julia stuff, I for one am going to take the step to make the Language Detector pluggable.
I’ll try and take this on over the next week. I’ll use a ServiceLoader approach similar to Translators, Detectors, Parsers, etc.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Ken Krugler <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, July 28, 2015 at 5:39 PM
To: "[email protected]" <[email protected]>
Subject: RE: Bayesian N-Gram Language Detection

>I think switching to language-detector is a reasonable first step (more
>languages, faster, better accuracy), after which we can evaluate the need
>to make it pluggable.
>
>There were some code & resource packaging issues with the original
>project, but the fork I've been trying out seems much better.
>
>See https://github.com/optimaize/language-detector
>
>Still ALv2, and already in the Maven central repo.
>
>-- Ken
>
>> From: Mattmann, Chris A (3980)
>> Sent: July 28, 2015 5:30:00pm PDT
>> To: [email protected]
>> Subject: Bayesian N-Gram Language Detection
>>
>> FYI the code is ALv2:
>>
>> https://github.com/shuyo/language-detection/blob/wiki/ProjectHome.md
>>
>>
>> I’m going to test this out and see how it compares with our own.
>> Maybe we need to make the Language Detector pluggable too.
>>
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Chief Architect
>> Instrument Software and Science Data Systems Section (398)
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 168-519, Mailstop: 168-527
>> Email: [email protected]
>> WWW: http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Adjunct Associate Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>--------------------------
>Ken Krugler
>+1 530-210-6378
>http://www.scaleunlimited.com
>custom big data solutions & training
>Hadoop, Cascading, Cassandra & Solr
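
For anyone following the thread, here is a rough sketch of what the ServiceLoader approach mentioned at the top of this message could look like. It is only an illustration under assumptions: the interface LanguageDetectorPlugin, the class FallbackDetector, and the method loadDetectors() are hypothetical names made up for this example, not the actual Tika API, and the keyword check in the fallback stands in for a real detector. Providers would normally be registered in a META-INF/services file named after the interface's binary class name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.ServiceLoader;

public class PluggableLanguageDetection {

    /** Hypothetical service interface (an assumption, not Tika's real API). */
    public interface LanguageDetectorPlugin {
        String name();

        /** Returns a language code such as "en", or "unknown". */
        String detect(String text);
    }

    /** Toy stand-in used only when no provider is registered on the classpath. */
    public static class FallbackDetector implements LanguageDetectorPlugin {
        @Override
        public String name() {
            return "fallback";
        }

        @Override
        public String detect(String text) {
            // Pad with spaces so that whole-word matches work at the edges.
            String padded = " " + text.toLowerCase(Locale.ROOT) + " ";
            if (padded.contains(" the ")) {
                return "en";
            }
            if (padded.contains(" der ") || padded.contains(" und ")) {
                return "de";
            }
            return "unknown";
        }
    }

    /**
     * Collects every implementation that ServiceLoader can discover.
     * If none is registered, the toy fallback is used instead.
     */
    public static List<LanguageDetectorPlugin> loadDetectors() {
        List<LanguageDetectorPlugin> found = new ArrayList<>();
        for (LanguageDetectorPlugin plugin : ServiceLoader.load(LanguageDetectorPlugin.class)) {
            found.add(plugin);
        }
        if (found.isEmpty()) {
            found.add(new FallbackDetector());
        }
        return found;
    }

    public static void main(String[] args) {
        LanguageDetectorPlugin detector = loadDetectors().get(0);
        System.out.println(detector.name() + ": " + detector.detect("the quick brown fox"));
    }
}
```

Run as-is with no provider registered, this falls through to the toy detector and prints "fallback: en". The point of the design is that a jar wrapping some other detector could be dropped on the classpath with its own registration file and be picked up without changes to the calling code.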
