Dear All, I am updating this thread again to ask for help. Is it possible to get the table cells from the table finder results?
On Tuesday, June 19, 2012 at 4:26:33 PM UTC+8, Neo Song wrote:
>
> Dear All,
>
> I am currently working on a table text extraction project, and we need to
> identify the table before any OCR processing.
> I investigated the related source code (checked-out version: r729) and
> found that there is a table finder class inside tesseract (tablefind.cpp).
> The problem is that for irregular tables (e.g. different rows have
> different numbers of columns), even if I get all the ruling lines, I cannot
> identify the concrete table cells.
> I have called the function "FindLinesCreateBlockList()" and I can
> iterate over all the text blocks, horizontal lines and vertical lines in the
> target image. However, I cannot do anything with these horizontal and
> vertical lines. What I need is something like a CELL_LIST, which contains
> every table cell in reading order based on the table ruling lines. I believe
> that the table finder may already contain such an algorithm (I read the code,
> but it is too complicated for me to follow), but that it is not exposed
> through the Base API. Is that true?
> Can someone help me with this? How can I obtain the table cells? An
> example of such an irregular table can be found in the attachment.
>
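To make the request concrete, here is a minimal sketch of the kind of "cell list in reading order" described above. It is not Tesseract code and does not reflect how tablefind.cpp works. It assumes the ruling lines are already available as axis-aligned segments with exact, snapped integer coordinates, that the outer border of the table is ruled, and that every cell is a rectangle. The function name `cells_from_rulings` and the tuple layouts are hypothetical, chosen only for this illustration. The idea is to build a grid from all line coordinates and merge neighbouring grid boxes wherever no ruling line separates them.

```python
from typing import Dict, List, Tuple

HLine = Tuple[int, int, int]      # (y, x_start, x_end), hypothetical layout
VLine = Tuple[int, int, int]      # (x, y_start, y_end), hypothetical layout
Cell = Tuple[int, int, int, int]  # (left, top, right, bottom)


def cells_from_rulings(hlines: List[HLine], vlines: List[VLine]) -> List[Cell]:
    """Return table cells in reading order (top to bottom, left to right)."""
    xs = sorted({x for x, _, _ in vlines})
    ys = sorted({y for y, _, _ in hlines})
    cols, rows = len(xs) - 1, len(ys) - 1
    if cols < 1 or rows < 1:
        return []

    # Union-find over the elementary grid boxes.
    parent = list(range(rows * cols))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a: int, b: int) -> None:
        parent[find(a)] = find(b)

    def has_vline(x: int, y0: int, y1: int) -> bool:
        # True if some vertical ruling at x spans the whole interval [y0, y1].
        return any(vx == x and a <= y0 and y1 <= b for vx, a, b in vlines)

    def has_hline(y: int, x0: int, x1: int) -> bool:
        # True if some horizontal ruling at y spans the whole interval [x0, x1].
        return any(hy == y and a <= x0 and x1 <= b for hy, a, b in hlines)

    # Merge neighbouring boxes that are not separated by a ruling line.
    for r in range(rows):
        for c in range(cols):
            idx = r * cols + c
            if c + 1 < cols and not has_vline(xs[c + 1], ys[r], ys[r + 1]):
                union(idx, idx + 1)
            if r + 1 < rows and not has_hline(ys[r + 1], xs[c], xs[c + 1]):
                union(idx, idx + cols)

    # The bounding box of each merged group is one table cell.
    boxes: Dict[int, Cell] = {}
    for r in range(rows):
        for c in range(cols):
            root = find(r * cols + c)
            left, top, right, bottom = xs[c], ys[r], xs[c + 1], ys[r + 1]
            if root in boxes:
                ol, ot, o_r, ob = boxes[root]
                left, top = min(left, ol), min(top, ot)
                right, bottom = max(right, o_r), max(bottom, ob)
            boxes[root] = (left, top, right, bottom)

    return sorted(boxes.values(), key=lambda box: (box[1], box[0]))


# A made-up irregular table: two columns in the first row, three in the second.
example_hlines = [(0, 0, 100), (30, 0, 100), (60, 0, 100)]
example_vlines = [(0, 0, 60), (100, 0, 60), (50, 0, 30), (30, 30, 60), (70, 30, 60)]
for cell in cells_from_rulings(example_hlines, example_vlines):
    print(cell)
```

For the made-up table above, this yields five cells: two in the first row, split at x = 50, followed by three in the second row, split at x = 30 and x = 70. Lines detected in a real image would first need to be snapped to common coordinates with some tolerance, which this sketch leaves out.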