Re: How exactly tesseract works

2011-10-28 Thread Navin Math
Thanks. Currently I am using images containing English text, and I have placed only the eng.traineddata file at the specified location. Do I need to place any other files at the same location for the tesseract tool? On Fri, Oct 28, 2011 at 8:06 PM, Sven Pedersen wrote: > Hi Navin, > Usually documents scann

Re: Pictures with numbers

2011-10-28 Thread Dmitri Silaev
Hehe, don't be so quick in your conclusions :) It does not work like this off-the-shelf, but it's open-source and that's its merit: you can develop a pre- or postprocessor, or even modify Tess's code to solve your own task. Even when dealing with images of variable structure. Warm regards, Dmitri

Re: Please help me

2011-10-28 Thread Sven Pedersen
Hi Bui Van, Have you read the documents on the website? What font do you need? --Sven On Thu, Oct 27, 2011 at 9:15 PM, bui van Chuong wrote: > Hello all, > > I want to train more fonts for eng.traineddata. Can I do it, and how? > > Sorry for my basic question, but I have no information about it. >

Re: Pictures with numbers

2011-10-28 Thread Joao Henriques
Hi everybody, thanks for the information. Unfortunately the images do not have a fixed structure which can be analysed properly by tesseract. I guess that it is not possible. My conclusion is that tesseract is great for doing OCR of text-images, but doesn't work for images with text :) Thanks for

Re: How exactly tesseract works

2011-10-28 Thread Sven Pedersen
Hi Navin, Usually documents scanned at 90dpi will do poorly, but what really matters is the font size. Typical 10-14 point font documents should be scanned at 200 - 300 dpi for best results. For training questions, you'll need to tell us more about whether the language and domain within the languag
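
The link between font size and scan resolution mentioned above can be made concrete with a little arithmetic. The sketch below is only an illustration, not something from the thread: it assumes the usual typographic convention of 72 points per inch and reports the nominal (em) height of the font in pixels, which is somewhat larger than the height of the actual glyphs. The function name font_height_px is made up for this example.

```python
POINTS_PER_INCH = 72  # typographic convention: 1 point = 1/72 inch


def font_height_px(point_size, dpi):
    """Nominal (em) height in pixels of a font of `point_size` scanned at `dpi`.

    This is a rough guide only; real glyphs are shorter than the em height.
    """
    return point_size * dpi / POINTS_PER_INCH


if __name__ == "__main__":
    # Compare a 10 point font at a low resolution and at the suggested range.
    for dpi in (90, 200, 300):
        print("10pt at %d dpi: about %.1f px" % (dpi, font_height_px(10, dpi)))
```

Under these assumptions, a 10 point font scanned at 90 dpi has a nominal height of only 12.5 pixels, while the same font at 300 dpi is more than three times taller, which is consistent with the advice that font size in pixels matters more than the DPI figure alone.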

Re: Choice Iterator

2011-10-28 Thread Sven Pedersen
http://groups.google.com/group/tesseract-ocr/topics On Fri, Oct 28, 2011 at 1:18 AM, merve t wrote: > How can I search the archives? I am googling but there is no result. I have > found one result, communicated with the author, and tried it, but there is still > no result. Thanks i am going to ask it to s

Re: Double or single Digit detection

2011-10-28 Thread zdenko podobny
'-psm 8' works for me On Fri, Oct 28, 2011 at 1:37 PM, Diez B. Roggisch wrote: > Hi all, > > I'm trying to detect page-numbers in an otherwise empty book. Using > the OpenCV I can extract the page-number, align it, and threshold a > good B/W picture out of it. > > However, when running tesseract
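
For readers who want to script the '-psm 8' suggestion, here is a minimal sketch of calling the command-line tool from Python. It is not taken from the thread. It assumes the tesseract binary is on the PATH and uses the 3.x command-line form (tesseract imagename outputbase [-l lang] [-psm pagesegmode]); page segmentation mode 8 treats the image as a single word. Newer releases spell the option --psm. The helper names and the file names in the usage comment are hypothetical.

```python
import subprocess


def build_tesseract_command(image_path, output_base, psm=8, lang="eng"):
    """Build the argument list for a tesseract 3.x command-line call.

    psm=8 asks tesseract to treat the image as a single word.
    Newer tesseract releases expect "--psm" instead of "-psm".
    """
    return ["tesseract", image_path, output_base, "-l", lang, "-psm", str(psm)]


def run_tesseract(image_path, output_base, psm=8, lang="eng"):
    """Run tesseract and return the recognized text.

    Assumes the tesseract binary is installed and on the PATH;
    tesseract writes its result to <output_base>.txt.
    """
    subprocess.run(build_tesseract_command(image_path, output_base, psm, lang),
                   check=True)
    with open(output_base + ".txt", encoding="utf-8") as handle:
        return handle.read().strip()


# Hypothetical usage, with a cropped page-number image:
#     text = run_tesseract("page_number.png", "page_number")
```

Keeping the command construction separate from the call makes it easy to print the exact command line and try it by hand when no digits come back.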

How exactly tesseract works

2011-10-28 Thread navin
Hi, I have two images of the same DPI (e.g. 90 dpi). I used the tesseract tool to extract the strings from both images. First image: almost 90% of the strings are properly recognized. Second image: 0%, no strings are recognized properly. I want to understand why it is failing here. To impro

Double or single Digit detection

2011-10-28 Thread Diez B. Roggisch
Hi all, I'm trying to detect page numbers in an otherwise empty book. Using OpenCV I can extract the page number, align it, and threshold a good B/W picture out of it. However, when running tesseract (command line so far) over it, no digits are detected. But then taking the image into GIMP, a

Re: Choice Iterator

2011-10-28 Thread merve t
How can I search the archives? I am googling but there is no result. I have found one result, communicated with the author, and tried it, but there is still no result. Thanks, I am going to ask it to so. 2011/10/27 Sven Pedersen > Hi Merve, > You'll find a few mentions of it in the past few months' archive

Re: From the ReadMe - "The dll isn't supported in Tesseract-OCR 3.00"

2011-10-28 Thread Slavko Kocjancic
Just a simple question: why was the DLL removed from 3.00 at all? I'm not a C programmer but want to use tesseract, so I like the DLL. Now I must do the job with the command line only, and that is very limited. Thanks.