Now that I know the resolution (and the image headers) were the issue, the
following should help anyone using Tesseract OCR with PHP who comes looking
for a solution in future:
1. Create your Imagick instance, i.e. $image = new Imagick('image.jpg');
2. Then set the resolution using two lines,
first: $image->setImageUnits(Imagick::RESOLUTION_PIXELSPERINCH);
then: $image->setImageResolution(300, 300);
3. The resolution is then set, ready for Tesseract to read.
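
Put together, the steps above might look something like the sketch below.
The helper name prepareForOcr() and its $dpi parameter are only for
illustration and are not part of Imagick; setImageUnits() and
setImageResolution() are the two calls mentioned in step 2. It assumes the
Imagick extension is installed.

```php
<?php
// Illustrative helper (hypothetical name): tag an image with a resolution
// so that Tesseract does not see "0 dpi" in the file headers.
function prepareForOcr(Imagick $image, float $dpi = 300.0): Imagick
{
    // Set the units first, so the values below are read as pixels per inch.
    $image->setImageUnits(Imagick::RESOLUTION_PIXELSPERINCH);

    // Same horizontal and vertical resolution, 300 by default as in step 2.
    $image->setImageResolution($dpi, $dpi);

    return $image;
}
```

For example, prepareForOcr(new Imagick('image.jpg'))->writeImage('image-300dpi.png');
would save a copy with the resolution set (the output filename is just an
example). Note this only changes the resolution metadata, it does not
resample the pixels.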
I hope that helps.
On Thursday, 30 April 2020 15:29:07 UTC+1, tristan gordon wrote:
>
> Thank you.
> Now to look at imagick to set the resolution!
>
> On Thursday, 30 April 2020 10:36:56 UTC+1, shree wrote:
>>
>> Looks like the image resolution is not set correctly. You can specify dpi
>> while processing.
>>
>> ubuntu@tesseract-ocr:~/TEST$ tesseract 82.png - --dpi 300
>> 82
>> ubuntu@tesseract-ocr:~/TEST$ tesseract 81.png - --dpi 300
>> 81
>>
>>
>> On Thu, Apr 30, 2020 at 2:57 PM tristan gordon <[email protected]>
>> wrote:
>>
>>> Hello all,
>>>
>>> Could you help?
>>>
>>> Attached are two images containing two numbers, 81 and 82, which I am
>>> attempting to get Tesseract OCR to read.
>>>
>>> Each time, Tesseract OCR reports an empty page and produces an empty
>>> text.txt document.
>>>
>>> The error is displaying as follows:
>>>
>>> # tesseract 82.png out
>>> Tesseract Open Source OCR Engine v4.1.1-rc2-20-g01fb with Leptonica
>>> Warning: Invalid resolution 0 dpi. Using 70 instead.
>>> Estimating resolution as 1622
>>> Empty page!!
>>> Estimating resolution as 1622
>>> Empty page!!
>>>
>>> How can I get the numbers to output? Are any changes required to the
>>> images or to Tesseract?
>>>
>>> These images were produced using CentOS 7, Apache, PHP and Imagick.
>>> Each image is retrieved from an external server, then processed with
>>> Imagick to crop, grayscale, trim to the focus area, resize, smooth edges,
>>> remove the background, convert to black and white, flatten, and set a
>>> resolution and image format.
>>> The images were then saved (for development purposes) and tested
>>> using the command above.
>>>
>>> Once these errors are sorted and it's running, tesseract-ocr-php will
>>> complete the process on the fly (as there are around 6000 images to read).
>>>
>>> Let me know.
>>>
>>> Thank you (in advance).
>>>
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/2314316b-1b5c-4a44-b9bb-8e65a901a688%40googlegroups.com
>>>
>>> <https://groups.google.com/d/msgid/tesseract-ocr/2314316b-1b5c-4a44-b9bb-8e65a901a688%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>>
>>
>> --
>>
>> ____________________________________________________________
>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>
>