Re: Tesseract and old fonts

2011-01-18 Thread daemon-s
Dear Andrew, I have a couple of observations on your problem. - The "standard" English language file was trained on images of common computer fonts such as Arial, Times, and Verdana, some Ghostscript fonts, and their italic and bold versions. Your book document's characters ha

Tesseract and old fonts

2011-01-18 Thread daemon-s
*** On behalf of Andy Syme, who could not post to this group, probably due to spam removal artefacts *** ...my problem is that I have some documents written in 1890-1920 that I scanned & want to OCR. They are in English, and with the standard English language file I was getting 40-50% recognition. I