[ https://issues.apache.org/jira/browse/TIKA-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17604903#comment-17604903 ]
Tim Allison commented on TIKA-3850:
-----------------------------------

I concur with Nick. For kicks, I ran this with our OpenNLPDetector, and it returned 'spa' as the most likely.

> Spanish text is incorrectly detected as Galician
> ------------------------------------------------
>
>                 Key: TIKA-3850
>                 URL: https://issues.apache.org/jira/browse/TIKA-3850
>             Project: Tika
>          Issue Type: Bug
>          Components: languageidentifier
>    Affects Versions: 2.4.1
>         Environment: org.apache.tika:tika-langdetect-optimaize:2.4.1
>                      org.apache.tika:tika-core:2.4.1
>            Reporter: Lenne Hendrickx
>            Priority: Minor
>
> The following Spanish text is incorrectly detected as Galician.
> {noformat}
> Hola! Donde puedo contactar para una garantía?{noformat}
> The es and gl models are loaded into the language detector.
> Language result:
> {noformat}
> language: gl
> score: 0.999995{noformat}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)