Documentation for the internals of Tesseract is unfortunately rather minimal. I'd recommend reading the TableFinder class in the source code to figure out how it works. And please do share anything you learn here!
Nick

On Mon, Apr 07, 2014 at 02:45:51AM -0700, ANBU J wrote:
> It's sad that we couldn't find any documentation for the table
> manipulation methods in Tesseract. It looks like I will have to
> implement an algorithm to handle tables manually.
> If you have done this already, please share the knowledge.
>
> On Tuesday, 25 June 2013 14:42:46 UTC+5:30, [email protected] wrote:
> >
> > Hi!
> > I'm going to work on a program that can recognize the table structure
> > and the text in the table.
> > I tried to OCR the table image from the command line on Windows 7, but
> > the output text was very bad
> > (like this: tesseract table.jpg out -l eng, or with "hocr").
> > I also tried using TessBaseAPI in VC (just a simple application).
> >
> > The table lines (especially the column lines) interfere with the whole image.
> >
> > Now I have found the class "TableFinder" in the Tesseract source code
> > (Tesseract-OCR-3.02), but I can't find anything else about it on the
> > Internet. Are there no demos or tutorials?
> >
> > I am new and sincerely hope to get some help. :)
> >
> > Thanks!

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en

