I suspect this paper is a sledgehammer to crack a nut. The method is
very general and elaborate, so implementing and debugging it would
likely take a great deal of time. Your images might need much simpler
methods.
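
For illustration only, here is a minimal sketch of one such simpler
method. It assumes the page has already been binarized and is not
skewed, and it clears every long horizontal or vertical run of dark
pixels, on the idea that ruling lines are much longer than character
strokes. The names TableLineRemover, removeLongRuns and minRun are
hypothetical and not part of Tesseract or any library; Java is used
only because it is the environment mentioned in the quoted message.

```java
final class TableLineRemover {

    /**
     * Returns a copy of ink in which every horizontal or vertical run of
     * dark pixels that is at least minRun long has been cleared.
     * ink[y][x] is true for a dark pixel (assumption: already binarized).
     */
    static boolean[][] removeLongRuns(boolean[][] ink, int minRun) {
        int h = ink.length;
        int w = (h == 0) ? 0 : ink[0].length;
        boolean[][] remove = new boolean[h][w];

        // Horizontal pass: mark long runs along each row.
        for (int y = 0; y < h; y++) {
            int start = 0; // first pixel of the current run
            for (int x = 0; x <= w; x++) {
                if (x == w || !ink[y][x]) {
                    if (x - start >= minRun) {
                        for (int i = start; i < x; i++) remove[y][i] = true;
                    }
                    start = x + 1;
                }
            }
        }

        // Vertical pass: mark long runs along each column.
        for (int x = 0; x < w; x++) {
            int start = 0;
            for (int y = 0; y <= h; y++) {
                if (y == h || !ink[y][x]) {
                    if (y - start >= minRun) {
                        for (int i = start; i < y; i++) remove[i][x] = true;
                    }
                    start = y + 1;
                }
            }
        }

        // Both passes read the original image, so pixels where a
        // horizontal and a vertical line cross are cleared as well.
        boolean[][] out = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                out[y][x] = ink[y][x] && !remove[y][x];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Tiny demo: one full-width line plus a short two-pixel stroke.
        boolean[][] ink = new boolean[5][7];
        for (int x = 0; x < 7; x++) ink[2][x] = true;
        ink[0][1] = true;
        ink[1][1] = true;

        boolean[][] cleaned = removeLongRuns(ink, 5);
        int remaining = 0;
        for (boolean[] row : cleaned) {
            for (boolean p : row) if (p) remaining++;
        }
        System.out.println("dark pixels left: " + remaining);
    }
}
```

A suitable minRun depends on the scan resolution and font size, and
faxed pages with broken or tilted lines would need something more
tolerant, so treat this as a starting point rather than a solution.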

I always say the same thing: send your sample images and the community
will try to help.

Warm regards,
Dmitry Silaev





On Mon, Mar 14, 2011 at 8:23 AM, David Hoffer <dhoff...@gmail.com> wrote:
> Hi Vicky,
>
> Can you tell me more about this paper?  It looks like this is not a
> free document so I can't just read it to see if it would solve the
> problem I have.
>
> My problem is that I have grey-scale image data (tif/jpg/etc.) that
> contains text laid out in a table, i.e. in cells on the page.  The
> documents were originally faxed and then converted to PDF, so the image
> quality varies from poor to good.  I don't want the table formatting;
> I'm looking for a way to remove it and get to just the text in the
> image, which I then want to convert to text using OCR, with Tesseract
> or otherwise.
>
> My programming environment is Java, but I can shell out to other
> programs if I need to.
>
> Would the approach in the paper solve this kind of problem?  How
> practical is the software solution for a one-man effort?
>
> Thanks,
> -Dave
>
>
>
> On Sun, Mar 13, 2011 at 10:18 AM, Vicky Budhiraja <vicky.vi...@gmail.com> 
> wrote:
>> Hello,
>>
>> I used this paper (for pre-processing):
>> Parameter-Free Geometric Document Layout Analysis, by Lee and Ryu, IEEE
>> Trans. Pattern Analysis and Machine Intelligence, Nov. 2001, Volume 23,
>> Issue 11, Pages 1240-1256
>>
>> Best Regards,
>> Vicky
>>
>>
>>
>> -----Original Message-----
>> From: tesseract-ocr@googlegroups.com [mailto:tesseract-ocr@googlegroups.com]
>> On Behalf Of Daphne
>> Sent: Friday, March 11, 2011 01:15
>> To: tesseract-ocr
>> Subject: how to get the character in an image file which is in table format.
>>
>> Hello,
>>
>> I have a scanned image file which contains a table. When I OCR it using
>> tessnet, it doesn't give the desired output.
>> It does not read the characters in the table. Instead, it gives some
>> numbers.
>>
>> How can I read the characters in an image that is in table format?
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To post to this group, send email to tesseract-ocr@googlegroups.com.
>> To unsubscribe from this group, send email to
>> tesseract-ocr+unsubscr...@googlegroups.com.
>> For more options, visit this group at
>> http://groups.google.com/group/tesseract-ocr?hl=en.
>>
>
