Using Tika with another OCR engine

Cristian Zamfir Thu, 03 Aug 2023 03:13:28 -0700

Hello,

I am interested in trying out Tika with a different OCR engine and am
wondering how Tesseract is integrated. Is it possible to write a plugin
that calls a different engine? For images this seems much easier: I can
just detect the file type and use an OCR engine instead. For scanned PDFs,
I assume there is some bi-directional communication between Tika and
Tesseract to detect inline images. Is that correct?


Thanks,
Cristi
