Hi,

Your question is a little difficult to understand. It sounds like you are saying that you have no OCR or image processing background, know Java, and want to modify Tesseract toward some aim that you do not specify?
As far as I understand, Tesseract is developed in C/C++, not Java. Only the Android JNI bindings would be Java. You can find the Tesseract source code at:
https://github.com/tesseract-ocr/tesseract

In terms of concepts, you should read "An Overview of the Tesseract OCR Engine" by Tesseract's lead, Ray Smith, as it will give you insight into the algorithms it employs for OCR:
http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/33418.pdf

Further concepts and algorithms can be found in the "Techniques" section at:
https://en.wikipedia.org/wiki/Optical_character_recognition

Sounds like an uphill struggle to me, but I wish you luck!

Cheers

On 15 June 2016 at 07:28, ravi katiyar <ravirange...@gmail.com> wrote:
> Hello All,
>
> I am new to the world of OCR and image processing as well. I come from
> a Java background.
> Can someone tell me what the prerequisites are to understand the Tesseract
> code?
> Like the java.awt.image package, digital image processing concepts? What
> would I need to be thorough with so that I am able to understand the
> Tesseract code?
>
> I want this understanding because I am aiming to make modifications to
> this code, so that Tesseract is able to extract text from a movie poster
> printed in a newspaper.
> Tesseract cannot do this currently.
>
> Thanks
> Ravi Katiyar
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To post to this group, send email to tesseract-ocr@googlegroups.com.
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/9a488786-ac4d-4d2e-a047-ebe329df1ea8%40googlegroups.com
> For more options, visit https://groups.google.com/d/optout.