JPEG is a bad idea for text data. If you must use it, pre-process the image first, but JPEG generally does not preserve a clean character outline; it is designed for photographs. Use PNG or TIFF instead, but beware that TIFF is just a wrapper, so it sometimes has a JPEG inside. You need a lossless, pixel-focused format.
--Sven
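Since TIFF is only a container, one way to see whether a given file really is lossless is to read the Compression tag (259) from its first image directory. Below is a minimal sketch using only the Python standard library. The function names are made up for illustration, it handles classic TIFF only (not BigTIFF), and it looks only at the first directory, so treat it as a rough check rather than a full parser.

```python
import struct

# Tag number and values as defined in the TIFF 6.0 specification.
TIFF_COMPRESSION_TAG = 259
JPEG_COMPRESSIONS = {6, 7}  # 6 = old-style JPEG, 7 = JPEG


def tiff_compression(data: bytes) -> int:
    """Return the Compression tag value of the first IFD (hypothetical helper)."""
    if data[:2] == b"II":
        endian = "<"  # little-endian byte order
    elif data[:2] == b"MM":
        endian = ">"  # big-endian byte order
    else:
        raise ValueError("not a TIFF file")

    # Header: 2-byte magic number (42) followed by the 4-byte offset of the first IFD.
    magic, ifd_offset = struct.unpack_from(endian + "HI", data, 2)
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF is not handled here)")

    # The IFD starts with a 2-byte entry count, then 12-byte entries.
    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    for i in range(entry_count):
        entry_offset = ifd_offset + 2 + 12 * i
        tag, field_type, count = struct.unpack_from(endian + "HHI", data, entry_offset)
        if tag == TIFF_COMPRESSION_TAG:
            # A single SHORT is stored left-justified in the 4-byte value field.
            (value,) = struct.unpack_from(endian + "H", data, entry_offset + 8)
            return value
    return 1  # the tag defaults to 1 (no compression) when absent


def is_lossy_tiff(data: bytes) -> bool:
    """True if the first IFD declares JPEG compression (hypothetical helper)."""
    return tiff_compression(data) in JPEG_COMPRESSIONS
```

To try it on a file such as a.tiff, read the bytes with `open("a.tiff", "rb").read()` and pass them to `is_lossy_tiff`. A result of `False` only says the first image is not JPEG-compressed; it says nothing about whether the pixels went through JPEG before being saved as TIFF.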
On Fri, Dec 14, 2012 at 9:10 AM, occorled <[email protected]> wrote:
> Thank you, I will do that for b.jpg.
>
> But like I said, both of those images have the same .dpi value in the
> file, yet a.tiff OCRs perfectly and b.jpg is horrible. So I'm not sure
> which algorithm I would employ at runtime to determine if I should up-scale
> an image or not. It seems you can't simply rely on the exif data. Not
> sure what the best approach is...
>
> On Thursday, December 13, 2012 8:32:04 PM UTC-5, Quan Nguyen wrote:
>>
>> Width and height are image dimensions but are incorrectly labeled as
>> resolution in some applications. Since your images are 96 DPI, tripling
>> their resolution should work better.
>>
>> On Wednesday, December 12, 2012 8:26:51 AM UTC-6, occorled wrote:
>>>
>>> I was always confused about DPI when it comes to images (versus print).
>>> I thought, it's all about (w x h) resolution, not DPI, right? I found this
>>> page to be informative (and funny) http://www.dpiphoto.eu/dpi.htm.
>>>
>>> So basically, I simply scale the image larger right? Perhaps double or
>>> triple the resolution of "b.jpg", right?
>>>
>>> On Tuesday, December 11, 2012 10:12:05 PM UTC-5, Quan Nguyen wrote:
>>>>
>>>> Rescaling to 300 DPI will produce much better results for the images.
>>>>
>>>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>

--
``All that is gold does not glitter, not all those who wander are lost; the old that is strong does not wither, deep roots are not reached by the frost.
From the ashes a fire shall be woken, a light from the shadows shall spring; renewed shall be blade that was broken, the crownless again shall be king.”

