Hi Nick, I thought the same thing. I will try to keep the public method signatures unchanged and will send updates on my progress.
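
As a rough illustration of that approach (a minimal sketch, not the actual Tika code): the public method keeps its signature while the OCR work is delegated to a swappable internal backend. All names here (`OcrBackend`, `CommandLineBackend`, `LibraryBackend`, `SketchOcrParser`, `extractText`) are hypothetical, and the backends only return placeholder strings instead of calling Tesseract or Tess4j.

```java
import java.util.Objects;

public class OcrFacadeSketch {

    /** Hypothetical internal abstraction; not part of Tika or Tess4j. */
    interface OcrBackend {
        String recognize(String imagePath);
    }

    /** Stand-in for the existing implementation (placeholder output only). */
    static final class CommandLineBackend implements OcrBackend {
        @Override
        public String recognize(String imagePath) {
            return "cli:" + imagePath;
        }
    }

    /** Stand-in for a rewritten, library-based implementation (placeholder output only). */
    static final class LibraryBackend implements OcrBackend {
        @Override
        public String recognize(String imagePath) {
            return "lib:" + imagePath;
        }
    }

    /** Hypothetical parser class: callers only ever see extractText(String). */
    static final class SketchOcrParser {
        private final OcrBackend backend;

        SketchOcrParser() {
            // The rewrite changes only which backend is used by default.
            this(new LibraryBackend());
        }

        SketchOcrParser(OcrBackend backend) {
            this.backend = Objects.requireNonNull(backend, "backend");
        }

        /** Public signature stays the same regardless of the backend behind it. */
        public String extractText(String imagePath) {
            return backend.recognize(imagePath);
        }
    }

    public static void main(String[] args) {
        // Existing call sites compile and run unchanged after the internal swap.
        SketchOcrParser parser = new SketchOcrParser();
        System.out.println(parser.extractText("scan.png"));
    }
}
```

Because callers depend only on `extractText(String)`, the old and new backends can also be run side by side against the same unit tests while the rewrite is in progress.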
On Tue, Mar 7, 2017 at 5:48 PM, Nick Burch <apa...@gagravarr.org> wrote:

> On Tue, 7 Mar 2017, Thejan Wijesinghe wrote:
>
>> I have already used the Tess4j API to rewrite the TesseractOCRParser class.
>> Although it successfully extracts content from most of the file types, it
>> fails some particular unit tests in the TesseractOCRParserTest class. I can
>> solve that. However, I want to know whether I can rewrite the entire
>> TesseractOCRParser class from the ground up. If I do that, there will be
>> many broken links in the internals of Tika because, as I witnessed, most of
>> the classes use the TesseractOCRParser class indirectly.
>
> If you can, try to keep the public methods unchanged. That way, other
> callers to the class will be unaffected by your re-write of the internal
> logic.
>
> Nick