Lenne Hendrickx created TIKA-3850:
-------------------------------------

             Summary: Spanish text is incorrectly detected as Galician
                 Key: TIKA-3850
                 URL: https://issues.apache.org/jira/browse/TIKA-3850
             Project: Tika
          Issue Type: Bug
          Components: languageidentifier
    Affects Versions: 2.4.1
         Environment: org.apache.tika:tika-langdetect-optimaize:2.4.1
                      org.apache.tika:tika-core:2.4.1
            Reporter: Lenne Hendrickx

The following Spanish text is incorrectly detected as Galician.
{noformat}
Hola! Donde puedo contactar para una garantía?{noformat}
The es and gl models are loaded into the language detector.

Language result:
{noformat}
language: gl
score: 0.999995{noformat}
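
A minimal reproduction sketch, assuming the two artifacts listed under Environment are on the classpath. The class name SpanishGalicianRepro and the helper detect are hypothetical names chosen for illustration, and the reporter's actual setup may differ (for example in how the models were loaded). It loads the es and gl models and prints the detected language and raw score for the text from the report:

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

import org.apache.tika.langdetect.optimaize.OptimaizeLangDetector;
import org.apache.tika.language.detect.LanguageDetector;
import org.apache.tika.language.detect.LanguageResult;

// Hypothetical class name; not taken from the report.
public class SpanishGalicianRepro {

    // Loads the es and gl models, as described in the report,
    // and returns the top-ranked result for the given text.
    static LanguageResult detect(String text) throws IOException {
        Set<String> languages = new HashSet<>(Arrays.asList("es", "gl"));
        LanguageDetector detector = new OptimaizeLangDetector().loadModels(languages);
        return detector.detect(text);
    }

    public static void main(String[] args) throws IOException {
        LanguageResult result = detect("Hola! Donde puedo contactar para una garantía?");
        System.out.println("language: " + result.getLanguage());
        System.out.println("score: " + result.getRawScore());
    }
}
```

If the behaviour matches the report, this should print gl with a score close to 1 rather than es. Swapping in other short Spanish sentences may help show whether the misdetection is specific to this input.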