Rupert Westenthaler created STANBOL-719:
-------------------------------------------

             Summary: Change from Langid to the Langdetect engine as default for Language detection
                 Key: STANBOL-719
                 URL: https://issues.apache.org/jira/browse/STANBOL-719
             Project: Stanbol
          Issue Type: Sub-task
            Reporter: Rupert Westenthaler
            Assignee: Rupert Westenthaler


After looking at the documentation and supported languages of both, I think that 
we should switch from the LangId Engine (based on Apache Tika language 
detection) to the Langdetect Engine (based on 
http://code.google.com/p/language-detection/).

Normal users should not notice any difference, as both engines create the same 
annotations. However, the latter supports considerably more languages.

This change will require many changes to the integration tests, as those 
check for the LangId Engine in many places. Those checks need to be changed 
to the Langdetect Engine.


        
