> As an added step, you might consider rendering to grayscale,
> slightly blurring (optional), adding a bit of noise, and then
> re-converting to b&w to simulate what physical scanners do?  Maybe do
> this at 1200 dpi and also downsample to 300 dpi.

I wouldn't have thought adding random noise would help. It will just
distort the shapes that Tesseract uses for matching, and since the
noise in a real scan will always differ from the noise I generated, I
assumed it would only hinder recognition further. Am I wrong about
this? Has anybody tested whether adding random noise to otherwise
clean training images improves things?
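
For anyone who wants to test this, the steps in the quoted suggestion
could be sketched roughly as below. This is only a dependency-free
illustration in Python that treats an image as a list of rows of gray
values (0 = black, 255 = white). The function names, the 3x3 box blur,
the Gaussian noise with sigma=20, and the choice to threshold after
downsampling are all my own assumptions; none of this is part of
Tesseract or its training tools.

```python
import random


def box_blur(img):
    """3x3 mean blur; pixels outside the image are clamped to the edge."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            row.append(total / 9.0)
        out.append(row)
    return out


def add_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise and clip to the 0..255 range."""
    return [[min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in img]


def downsample(img, factor):
    """Average factor x factor blocks, e.g. factor=4 for 1200 -> 300 dpi."""
    h, w = len(img) // factor, len(img[0]) // factor
    area = factor * factor
    return [[sum(img[y * factor + j][x * factor + i]
                 for j in range(factor) for i in range(factor)) / area
             for x in range(w)]
            for y in range(h)]


def threshold(img, cutoff=128.0):
    """Convert grayscale back to pure black (0) and white (255)."""
    return [[0 if p < cutoff else 255 for p in row] for row in img]


def degrade(img, sigma=20.0, factor=4, blur=True, seed=None):
    """Blur (optional), add noise, downsample, then re-binarize."""
    rng = random.Random(seed)
    gray = [[float(p) for p in row] for row in img]
    if blur:
        gray = box_blur(gray)
    gray = add_noise(gray, sigma, rng)
    gray = downsample(gray, factor)
    return threshold(gray)
```

Calling degrade() on the same clean 1200 dpi rendering with different
seed values gives several differently degraded 300 dpi copies, which
could be compared against training on the clean image alone.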
