-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Loren Wilton wrote:
>> Here's the pic in question as original gif (I joined the parts to
>>  make it easier for gocr):
>> http://www.matthias-keller.ch/ocrmail.gif and converted to pnm:
>> http://www.matthias-keller.ch/ocrmail.pnm
>>
>> And here's what   gocr -i ocrmail.pnm   spits out in my case:
>> http://www.matthias-keller.ch/ocrmail.gocr
>
> The only thing your scan got decently was the sans-serif font.  All
> of the serif text and the italic sans-serif text turned to
> garbage.
>
> I'm not quite sure why this should be.  That looks like pretty
> clean text that should be pretty recognizable.  The contrast could
> be a problem, but that 100% accuracy on the one line indicates that
> it probably isn't.  There should be an option to one of the
> programs to do a b/w transform on this. That may help.
>
> I'd look to see if the ocr program has any options on the kinds of
> fonts it recognizes.
>
> Ok.  A little playing around in Photoshop.  That is all
> anti-aliased fonts. It looks real good in the gif.  If you convert
> it to jpg, or I suspect any other lossy compression at standard
> compression rates, the results are unusable; there just aren't
> enough pixels.
>
> If you keep all of the pixels (doing this on Windows I went
> gif->bmp to import it to Photoshop) you have better luck.
>
> However, if you attempt to threshold to b/w at the default 50%
> threshold level the results are unusable.  If you threshold at
> around 170-190 (out of 255), or roughly 67-75%, then you get much
> better results.
>
> If you can't control the threshold level, you can try taking the
> contrast up.  I set the contrast to 100% and then thresholded.  The
> results weren't quite as good, but those are settings that don't
> require experimentation.  A contrast around 90% might have been
> better, but I haven't tried that yet.
>
> Loren
>
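
For anyone who wants to try the thresholding Loren describes without
Photoshop, here is a minimal sketch in plain Python. It assumes dark
text on a light background and 8-bit grayscale values; the function
names and the sample pixel row are made up for illustration and are not
taken from the image in question or from gocr.

```python
def threshold(pixels, level=180):
    """Turn 8-bit grayscale rows into pure black (0) and white (255).

    Values below `level` become black, everything else becomes white.
    """
    return [[0 if p < level else 255 for p in row] for row in pixels]


def to_plain_pbm(bw_rows):
    """Serialize thresholded rows as plain PBM (P1), where 1 means black."""
    height, width = len(bw_rows), len(bw_rows[0])
    lines = ["P1", f"{width} {height}"]
    lines += [" ".join("1" if p == 0 else "0" for p in row) for row in bw_rows]
    return "\n".join(lines) + "\n"


# Hypothetical cross-section of a thin anti-aliased stroke (assumed values).
stroke = [[255, 200, 150, 90, 150, 200, 255]]

print(threshold(stroke, level=128))  # 50% threshold: only the darkest pixel stays black
print(threshold(stroke, level=180))  # ~70% threshold: the gray edge pixels stay black too
```

With the 50% level the stroke shrinks to a single pixel, which would be
consistent with thin anti-aliased glyphs falling apart; a level in the
170-190 range keeps the gray edge pixels as part of the letter.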

We have now traced the problem to his gocr version. I ran my gocr over
his pnm file and got much better results than he did, in fact the
same results I got on his gif. So something is probably wrong with his
gocr version, since no special arguments are needed (we are using
the same ones).

Chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFE2btHJQIKXnJyDxURAjnHAJ4zriGQSU4B2Sr/ii+ivMfG3QRMZwCeI/7a
lVOtMTrJPQbVSkrLpt0760g=
=spr2
-----END PGP SIGNATURE-----
