[jira] [Updated] (TIKA-1696) Language Identification with Text Processing Toolkit from MITLL

2016-01-24 Thread Chris A. Mattmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/TIKA-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris A. Mattmann updated TIKA-1696:

Fix Version/s: (was: 1.12)
   1.13

> Language Identification with Text Processing Toolkit from MITLL
> ---
>
> Key: TIKA-1696
> URL: https://issues.apache.org/jira/browse/TIKA-1696
> Project: Tika
>  Issue Type: New Feature
>  Components: languageidentifier
> Reporter: Paul Ramirez
> Fix For: 1.13
>
>
> The aim here is to extend the methods for language identification within 
> text. MIT Lincoln Labs has an open source library [1] written in Julia. 
> Having spoken with the MITLL team, I understand there may also be a Scala 
> version of this library, which would make it easier to package with Tika. 
> At this point I'm not sure how many languages the library supports by 
> default, but it can be extended when provided with some training data.
> [1] https://github.com/mit-nlp/Text.jl



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TIKA-1696) Language Identification with Text Processing Toolkit from MITLL

2015-10-18 Thread Chris A. Mattmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/TIKA-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris A. Mattmann updated TIKA-1696:

Fix Version/s: (was: 1.11)
   1.12

> Language Identification with Text Processing Toolkit from MITLL
> ---
>
> Key: TIKA-1696
> URL: https://issues.apache.org/jira/browse/TIKA-1696
> Project: Tika
>  Issue Type: New Feature
>  Components: languageidentifier
> Reporter: Paul Ramirez
> Fix For: 1.12





[jira] [Updated] (TIKA-1696) Language Identification with Text Processing Toolkit from MITLL

2015-08-08 Thread Dave Meikle (JIRA)

 [ 
https://issues.apache.org/jira/browse/TIKA-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Meikle updated TIKA-1696:
--
Fix Version/s: (was: 1.10)
   1.11

* Pushed to 1.11 following 1.10 release

> Language Identification with Text Processing Toolkit from MITLL
> ---
>
> Key: TIKA-1696
> URL: https://issues.apache.org/jira/browse/TIKA-1696
> Project: Tika
>  Issue Type: New Feature
>  Components: languageidentifier
> Reporter: Paul Ramirez
> Fix For: 1.11




