Just so I get this right: is there a one-to-one mapping between a
LanguageProfile and its training data? The code I'm looking at now allows
one to train on multiple languages.

Thanks,
Paul

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Paul Ramirez, M.S.
Technical Group Supervisor
Computer Science for Data Intensive Applications (398M)
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 158-264, Mailstop: 158-242
Email: [email protected]
Office: 818-354-1015
Cell: 818-395-8194
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

On Aug 3, 2015, at 7:37 PM, "Mattmann, Chris A (3980)" <[email protected]> wrote:

Thanks Oleg

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++





-----Original Message-----
From: Oleg Tikhonov <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, July 29, 2015 at 12:01 AM
To: "[email protected]" <[email protected]>
Subject: Re: Bayesian N-Gram Language Detection

+1 !!!
My two cents.
Please also add the ability to change/retrain/tote language profiles.

Thanks !!!
BR,
Oleg

On Wed, Jul 29, 2015 at 3:59 AM, Mattmann, Chris A (3980) <[email protected]> wrote:

Cool. Well with this one I found, along with language-detector,
along with Ramirez and the work with Joe Campbell’s group at MIT-LL
and the Julia stuff, I for one am going to take the step to make it
pluggable.

I’ll try and take this on over the next week. I’ll use a ServiceLoader
approach similar to Translators, Detectors, Parsers, etc.
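
For illustration only, a minimal sketch of what a ServiceLoader-based plug-in
point could look like. The names PluggableDetectorSketch, LanguageDetector, and
UnknownDetector are hypothetical and are not taken from Tika; only
java.util.ServiceLoader is a real JDK class. The sketch assumes providers
would be registered in a META-INF/services file named after the interface,
and it falls back to a trivial detector when none is found.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class PluggableDetectorSketch {

    /** Hypothetical service interface; not Tika's actual API. */
    public interface LanguageDetector {
        String name();
        /** Returns a language code for the text, or "unknown". */
        String detect(String text);
    }

    /** Trivial fallback used when no provider is registered. */
    public static final class UnknownDetector implements LanguageDetector {
        @Override
        public String name() {
            return "fallback";
        }

        @Override
        public String detect(String text) {
            return "unknown";
        }
    }

    /**
     * Collects every implementation that ServiceLoader can find on the
     * classpath. If none is registered, the fallback is returned instead.
     */
    public static List<LanguageDetector> loadDetectors() {
        List<LanguageDetector> found = new ArrayList<>();
        for (LanguageDetector d : ServiceLoader.load(LanguageDetector.class)) {
            found.add(d);
        }
        if (found.isEmpty()) {
            found.add(new UnknownDetector());
        }
        return found;
    }

    public static void main(String[] args) {
        for (LanguageDetector d : loadDetectors()) {
            System.out.println(d.name() + " -> " + d.detect("Hello world"));
        }
    }
}
```

With this shape, swapping in a different detector would only require adding a
jar that registers its own implementation, without changing the calling code.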

Cheers,
Chris

-----Original Message-----
From: Ken Krugler <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, July 28, 2015 at 5:39 PM
To: "[email protected]" <[email protected]>
Subject: RE: Bayesian N-Gram Language Detection

I think switching to language-detector is a reasonable first step (more
languages, faster, better accuracy), after which we can evaluate the need
to make it pluggable.

There were some code & resource packaging issues with the original
project, but the fork I've been trying out seems much better.

See https://github.com/optimaize/language-detector

Still ALv2, and already in the Maven central repo.

-- Ken

From: Mattmann, Chris A (3980)
Sent: July 28, 2015 5:30:00pm PDT
To: [email protected]
Subject: Bayesian N-Gram Language Detection

FYI the code is ALv2:

https://github.com/shuyo/language-detection/blob/wiki/ProjectHome.md


I’m going to test this out and see how it compares with our own.
Maybe we need to make the Language Detector pluggable too.

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr
