I have the same problem. So it is probably the first case: how do I force
the Tika parser to recognize Cyrillic characters as required? For me, it
tries to read Russian text as English transliteration, so the Russian text
shows up in the result using only the Latin alphabet.
On Feb 10, 2017, at 17:55, "Alexandre Rafalovitch" wrote:
At what level exactly is this a problem? Are you looking for a way for Solr
to pass the -l rus flag through Tika to the OCR engine?
Or are you saying that whatever OCR is used here is bad? In the second
case, this is probably not a question for Solr or even Tika, but for
whatever the underlying OCR library is.
The stack is deep.
Hello everyone, I've encountered the error mentioned in the title.
The original image is attached, and the recognized text is below:
3ApaBCTyI7ITe 9| )KVIBy xopomo
Has anyone faced something similar?
I should mention that tesseract recognizes it more correctly with the -l rus option.
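
To check this outside Solr and Tika, here is a minimal sketch of calling
tesseract directly with the Russian language selected. It assumes the
tesseract binary is on the PATH and the "rus" language data is installed.
The helper names build_tesseract_command and ocr_image are hypothetical,
made up for this illustration.

```python
import subprocess


def build_tesseract_command(image_path, lang="rus"):
    """Build the CLI call: tesseract <image> stdout -l <lang>."""
    # Using "stdout" as the output base makes tesseract print the
    # recognized text instead of writing it to <outputbase>.txt.
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_image(image_path, lang="rus"):
    """Run tesseract on one image and return the recognized text.

    Assumes tesseract and the requested language data are installed.
    """
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,  # raise CalledProcessError if tesseract fails
    )
    return result.stdout.decode("utf-8")
```

Running ocr_image on the same image with lang="eng" and then lang="rus"
should show whether the Latin-only output comes from the language setting
rather than from Solr or Tika.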
Thanks in advance!
Best regards,