[ https://issues.apache.org/jira/browse/TIKA-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Allison resolved TIKA-2106.
-------------------------------
    Resolution: Fixed

Thank you!

> "hocr" case on Linux fails, but works on OSX. Related to TIKA-2093
> -------------------------------------------------------------------
>
> Key: TIKA-2106
> URL: https://issues.apache.org/jira/browse/TIKA-2106
> Project: Tika
> Issue Type: Bug
> Components: ocr
> Environment: Bug in Linux, but fine in OSX.
> Reporter: Eric Pugh
> Assignee: Tim Allison
>
> We pass an output type, either TXT or HOCR, to the Tesseract command line.
> When we call the command line we lowercase it to "txt" or "hocr". However,
> when we read the output back in, we don't lowercase it. On OSX the
> constructed file path "output.HOCR" is actually found, but on Linux it
> is not. This patch lowercases HOCR to hocr and TXT to txt in the
> constructed file path.
> I didn't write a unit test as I don't have a good Linux environment to test
> it in, but I was able to put a patched version of the Tika Parser Jar into my
> Docker build to confirm that it works.
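
For illustration only, here is a minimal Java sketch of the kind of fix the report describes. It is not the actual Tika patch. The class `OcrOutputPath`, the enum `OutputType`, and the methods `extensionFor` and `outputFile` are hypothetical names chosen for this example. The idea is to derive the file extension in one place, lowercased, and to use that same value both for the Tesseract command line and for the path that is read back. The difference between the platforms is presumably due to the default OSX filesystem being case-insensitive while typical Linux filesystems are case-sensitive.

```java
import java.io.File;
import java.util.Locale;

public class OcrOutputPath {

    /** Hypothetical stand-in for the output type passed to Tesseract. */
    public enum OutputType { TXT, HOCR }

    /** Extension in the form used on the command line: "txt" or "hocr". */
    public static String extensionFor(OutputType type) {
        // Locale.ROOT keeps the result independent of the JVM's default locale.
        return type.name().toLowerCase(Locale.ROOT);
    }

    /** Builds the path to read back, reusing the same lowercased extension. */
    public static File outputFile(File outputBase, OutputType type) {
        return new File(outputBase.getPath() + "." + extensionFor(type));
    }

    public static void main(String[] args) {
        File base = new File("output");
        // Prints "output.hocr" rather than "output.HOCR".
        System.out.println(outputFile(base, OutputType.HOCR).getName());
    }
}
```

Because both the writer and the reader go through `extensionFor`, the two names cannot drift apart in case, so the lookup behaves the same on case-sensitive and case-insensitive filesystems.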