Could you give us a link to where the text of this article can be
downloaded from? Can't find it anywhere, only the title and authors.


On Thu, Mar 31, 2011 at 6:09 AM, Cong Nguyen <[email protected]> wrote:
> Please refer to "OPTIMIZING SPEED FOR ADAPTIVE LOCAL THRESHOLDING ALGORITHM
> USING DYNAMIC PROGRAMMING".
> Complexity is: O(n), n is number of pixels.
>
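Since I could not find the paper, here is only my guess at how O(n) is reached, not the authors' algorithm: build a summed-area table (integral image) in one pass, after which the mean of any window costs four lookups per pixel. A minimal Python sketch, where the function names, the window radius and the 15% offset are my own assumptions:

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def adaptive_threshold(img, radius=7, offset=0.15):
    """Mark a pixel as ink (1) if it is darker than (1 - offset) * local mean."""
    h, w = len(img), len(img[0])
    sat = integral_image(img)
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        # Clip the window to the image borders.
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        for x in range(w):
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            area = (y1 - y0) * (x1 - x0)
            # Window sum from four table lookups, independent of window size.
            total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
            # Compare pixel * area with the scaled sum to avoid a division.
            out[y][x] = 1 if img[y][x] * area < total * (1.0 - offset) else 0
    return out
```

Each pixel is visited a constant number of times, so the cost does not depend on the window size.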
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> On Behalf Of Max Cantor
> Sent: Thursday, March 31, 2011 7:28 AM
> To: [email protected]
> Cc: [email protected]
> Subject: Re: tips for improving Tesseract accuracy and speed...
>
> Yes. I've had great experience with Sauvola binarization from Leptonica. Gamer
> works too, but is much, much slower.
>
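For anyone who has not used it: Sauvola's method computes a separate threshold for each pixel from the mean m and standard deviation s of a surrounding window, T = m * (1 + k * (s / R - 1)). A rough Python sketch for a single window follows; it is not Leptonica's code, and the defaults k = 0.5 and R = 128 are the values commonly quoted for 8-bit images, used here as assumptions:

```python
import math


def sauvola_threshold(window, k=0.5, r=128.0):
    """Sauvola threshold T = m * (1 + k * (s / R - 1)) for one window.

    `window` is a flat list of gray values (0..255) around the pixel.
    """
    n = len(window)
    mean = sum(window) / n
    variance = sum((v - mean) ** 2 for v in window) / n
    std = math.sqrt(variance)
    return mean * (1.0 + k * (std / r - 1.0))


def is_ink(pixel, window, k=0.5, r=128.0):
    """A pixel counts as ink when it is at or below its local threshold."""
    return pixel <= sauvola_threshold(window, k, r)
```

In a flat window the standard deviation is zero, so the threshold drops to half the mean and plain background is not turned into noise; in a high-contrast window the threshold rises toward the mean.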
> On Mar 31, 2011, at 0:02, cong nguyenba <[email protected]> wrote:
>
>> Here is another approach for you: try binarizing with an adaptive
>> threshold! For a speedup, dig into the engine by following the adaptive
>> classification code in the source! I think that will be enough to meet
>> your expectations!
>>
>> On Wednesday, March 30, 2011, Dmitri Silaev <[email protected]> wrote:
>>> P.S.: If you're still sure that reasonable downscaling of your images
>>> sacrifices the accuracy, please share one or two of your *unprocessed*
>>> images to investigate further.
>>>
>>> And I'd suggest keeping up with the latest revisions of Tesseract. The
>>> API changes significantly, but Tess is definitely being improved in
>>> terms of stability, new capabilities and also code efficiency,
>>> which may directly lead to the improved performance you are
>>> looking for.
>>>
>>> Warm regards,
>>> Dmitri Silaev
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Mar 29, 2011 at 8:17 AM, Andres <[email protected]> wrote:
>>>> ...required.
>>>>
>>>> Hello people,
>>>>
>>>> I have been developing a licence plate recognition system for a long time,
>>>> and I still have to improve my use of Tesseract to make it usable.
>>>>
>>>> My first concern is about speed:
>>>> After extracting the licence plate image, I get an image like this:
>>>>
>>>>
>>>> https://docs.google.com/leaf?id=0BxkuvS_LuBAzNmRkODhkYTUtNjcyYS00Nzg5LWE0ZDItNWM4YjRkYzhjYTFh&hl=en&authkey=CP-6tsgP
>>>>
>>>> As you can see, there are only 6 characters (Tess recognizes more
>>>> because there are some blemishes, but I get rid of them by
>>>> postprocessing the layout of the recognized chars).
>>>>
>>>> On an Intel i7 720 (good power, but using a single thread) the Tesseract
>>>> part takes something like 230 ms. This is too much time for what I need.
>>>>
>>>> The image is 500 x 117 pixels. I noticed that when I reduce the size of this
>>>> image, the detection time drops in proportion to the image area, which
>>>> makes good sense. But the accuracy of the OCR is poor when the character
>>>> height is below 90 pixels.
>>>>
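To put numbers on the speed side: if recognition time really is proportional to image area, as Andres observed, then scaling both dimensions by a factor s multiplies the time by s squared. A small sketch of that arithmetic (the function names are mine, and the strictly linear model is an assumption, not something measured):

```python
import math


def estimated_ocr_ms(base_ms, scale):
    """Estimated time after resizing both dimensions by `scale`,
    assuming time is proportional to image area."""
    return base_ms * scale * scale


def scale_for_budget(base_ms, budget_ms):
    """Resize factor needed to fit a time budget under the same assumption."""
    return math.sqrt(budget_ms / base_ms)


print(estimated_ocr_ms(230, 0.5))  # 57.5
```

By this estimate, halving both dimensions of the 500 x 117 plate would bring 230 ms down to about 57 ms, but the characters would then be well under the 90 pixel height mentioned above, so accuracy is the limiting factor.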
>>>> So, I assume that there is a problem with the way I trained tesseract.
>>>>
>>>> Because the characters in the plates are assorted (3 alphanumeric, 3
>>>> numeric) I trained it with just a single image with all the letters of the
>>>> alphabet. I saw that you suggest large training sets, but I imagine that
>>>> doesn't apply here, where the characters are not organized in words. Am I
>>>> correct about this?
>>>>
>>>> So that you can see it, this is the image I trained Tesseract with:
>>>>
>>>>
>>>> https://docs.google.com/viewer?a=v&pid=explorer&chrome=true&srcid=0BxkuvS_LuBAzODc1YjIxNWUtNzIxMS00Yjg3LTljMDctNDkyZGIxZWM4YWVm&hl=en&authkey=CMXwo-AL
>>>>
>>>> In this image the characters are about 55 pixels high.
>>>>
>>>> Then, for frequent_word_list and words_list I included a single entry for
>>>> each character; I mean, something starting like this:
>>>>
>>>> A
>>>> B
>>>> C
>>>> D
>>>> ...
>>>>
>>>> Do you see anything I could improve in what I did? Should I perhaps use a
>>>> training image with more letters, with more combinations? Will that help
>>>> somehow?
>>>>
>>>> Should I include in the same image a copy of the same character set but at a
>>>> smaller size? That way, will I be able to pass Tesseract smaller images
>>>> and get more speed without sacrificing detection quality?
>>>>
>>>>
>>>> On the other hand, I found some strange behavior in Tesseract about which I
>>>> would like to know a little more:
>>>> In my preprocessing I tried Otsu thresholding
>>>> (http://en.wikipedia.org/wiki/Otsu%27s_method) and visually I got much
>>>> better results, but surprisingly for Tesseract it was worse. It decreased
>>>> the stroke thickness of the chars, and the chars I used to train
>>>> Tesseract were bolder. So, does Tesseract match the "boldness" of the
>>>> characters? Should I train Tesseract with different levels of boldness?
>>>>
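For reference, Otsu's method picks one global threshold by maximizing the between-class variance of the gray-level histogram. A minimal pure-Python sketch (function names are my own, and it assumes 8-bit gray values with dark text on a light background) that also shows how the chosen threshold changes how much of a blurred stroke counts as ink:

```python
def otsu_threshold(pixels):
    """Return the gray level t (0..255) that maximizes between-class
    variance for the split {values <= t} versus {values > t}."""
    hist = [0] * 256
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels at or below t
    sum0 = 0    # sum of gray values at or below t
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def ink_fraction(pixels, t):
    """Fraction of pixels classified as ink (value <= t)."""
    return sum(1 for v in pixels if v <= t) / len(pixels)
```

A lower threshold keeps only the darkest core of each stroke, so the binarized characters come out thinner than in training images made with a higher threshold, which may be what happened to the plate characters here.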
>>>> I'm using Tesseract 2.04 for this. Do you think that some of these issues
>>>> will improve by using Tess 3.0?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Andres
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google Groups
>>>> "tesseract-ocr" group.
>>>> To post to this group, send email to [email protected].
>>>> To unsubscribe from this group, send email to
>>>> [email protected].
>>>> For more options, visit this group at
>>>> http://groups.google.com/group/tesseract-ocr?hl=en.
>>>>