[ https://issues.apache.org/jira/browse/SOLR-11773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cassandra Targett updated SOLR-11773:
-------------------------------------
    Component/s: contrib - Solr Cell (Tika extraction)

> configurable language config for tesseract ocr
> ----------------------------------------------
>
>                 Key: SOLR-11773
>                 URL: https://issues.apache.org/jira/browse/SOLR-11773
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - Solr Cell (Tika extraction)
>    Affects Versions: 7.1
>            Reporter: Advokat
>            Priority: Minor
>
> Currently, to change the language for Tesseract, I have to modify
> \org\apache\tika\parser\ocr\TesseractOCRConfig.properties inside
> tika-parsers-1.16.jar.
> There is no way to set the language in solrconfig.xml or on each
> request to the ExtractingRequestHandler.
> If someone has documents in different languages, it's impossible to configure
> this. Tesseract will not work as well as it could with a correctly set language.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org