[ 
https://issues.apache.org/jira/browse/TIKA-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16707822#comment-16707822
 ] 

Ken Krugler commented on TIKA-2790:
-----------------------------------

[~talli...@apache.org] - I've compared Yalder to Optimaize's version of 
language-detector. On the EuroParl sample (21 languages, 1,000 chunks of text 
for each), the run time and error rate were:

Detector                       Time     Error rate
language-detector              1201ms   0.29%
yalder                         591ms    0.06%
yalder (20 ngram min count)    555ms    0.085%
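
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of a measurement loop that reports elapsed time and error rate for one detector. It is not the harness used for the numbers above. The names `benchmark`, `detect`, and `toy_detect` are hypothetical, and the detector is assumed to be any callable that maps a chunk of text to a language code.

```python
import time
from typing import Callable, Iterable, Tuple


def benchmark(detect: Callable[[str], str],
              samples: Iterable[Tuple[str, str]]) -> Tuple[float, float]:
    """Return (elapsed_ms, error_rate_percent) for one detector.

    `detect` is a hypothetical callable mapping text to a language code;
    `samples` yields (expected_language, text) pairs.
    """
    samples = list(samples)  # materialize first so loading is not timed
    errors = 0
    start = time.perf_counter()
    for expected, text in samples:
        if detect(text) != expected:
            errors += 1
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    # Error rate as a percentage of all chunks; 0.0 for an empty sample.
    error_rate = 100.0 * errors / len(samples) if samples else 0.0
    return elapsed_ms, error_rate


def toy_detect(text: str) -> str:
    # Stand-in detector for demonstration only; it is neither Yalder
    # nor language-detector.
    return "de" if "und" in text.split() else "en"
```

Running each detector through the same loop over the same chunks keeps the time and error figures comparable. A real run would also want a warm-up pass before timing.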

> Consider switching lang-detection in tika-eval to open-nlp
> ----------------------------------------------------------
>
>                 Key: TIKA-2790
>                 URL: https://issues.apache.org/jira/browse/TIKA-2790
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Minor
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
