[ https://issues.apache.org/jira/browse/TIKA-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tyler Palsulich updated TIKA-1405:
----------------------------------
    Summary: Uppercase content detected as Estonian  (was: German content detected as French)

> Uppercase content detected as Estonian
> --------------------------------------
>
>                 Key: TIKA-1405
>                 URL: https://issues.apache.org/jira/browse/TIKA-1405
>             Project: Tika
>          Issue Type: Bug
>          Components: languageidentifier
>    Affects Versions: 1.4
>         Environment: Linux
>            Reporter: Zaheer Beig
>              Labels: newbie
>
> Hi,
> We are using Apache Tika 1.4 for document conversion to text and language
> detection in one of our projects. We are facing the following issue with
> language detection:
> 1. When the text is in all UPPER CASE, even though the language is English,
> it gets detected as Estonian.
> Any update on this would be very helpful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)