On Tue, 7 Mar 2017, Thejan Wijesinghe wrote:
I have already used the Tess4j API to rewrite the TesseractOCRParser class. Although it successfully extracts content from most file types, it fails some particular unit tests in the TesseractOCRParserTest class. I can solve that. However, I want to know whether I can rewrite the entire TesseractOCRParser class from the ground up. If I do that, there will be many broken links in the internals of Tika because, as I witnessed, most of the classes use the TesseractOCRParser class indirectly.
If you can, try to keep the public methods unchanged. That way, other callers of the class will be unaffected by your rewrite of the internal logic.
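
As a rough illustration of that idea, here is a minimal sketch in Java. None of these names come from Tika or Tess4j: SketchOcrParser, OcrEngine, LegacyEngine and RewrittenEngine are all hypothetical stand-ins, and the engines only return placeholder strings rather than doing real OCR. The point is just that the public method keeps its signature while the implementation behind it is swapped.

```java
// Hypothetical internal abstraction; not part of Tika or Tess4j.
interface OcrEngine {
    String recognize(byte[] imageBytes);
}

// Stand-in for the old internals (for example, calling an external binary).
class LegacyEngine implements OcrEngine {
    @Override
    public String recognize(byte[] imageBytes) {
        return "legacy:" + imageBytes.length;
    }
}

// Stand-in for the rewritten internals (for example, a library-backed engine).
class RewrittenEngine implements OcrEngine {
    @Override
    public String recognize(byte[] imageBytes) {
        return "rewritten:" + imageBytes.length;
    }
}

// Hypothetical parser facade. Callers only ever see parse(byte[]).
class SketchOcrParser {
    private final OcrEngine engine;

    // The default constructor picks the new engine; callers need no changes.
    SketchOcrParser() {
        this(new RewrittenEngine());
    }

    // Package-private constructor so tests can inject either engine.
    SketchOcrParser(OcrEngine engine) {
        this.engine = engine;
    }

    // Public signature stays exactly as it was before the rewrite.
    public String parse(byte[] imageBytes) {
        return engine.recognize(imageBytes);
    }
}
```

With this shape, code that calls parse() compiles and runs the same way before and after the rewrite, and the existing unit tests can serve as a check that behaviour has not drifted.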
Nick