[tesseract-ocr] tesseract via gosseract returns empty text for one image, but CLI detects correctly ("NO SMOKING")

Harshit Goel Fri, 31 Oct 2025 10:44:59 -0700


Hi team,

I’m facing an issue where Tesseract OCR works correctly from the CLI, but
returns an empty string when called programmatically using Go (via
gosseract).

For this particular image:
https://pmi-api.ubconnex.ca/files/icons/2025-03/11c6051eec503f52c43f0de382980d31.png

the OCR always returns an empty string when run programmatically. Yet when I
run Tesseract manually on the exact same image from the terminal with the
command *tesseract /tmp/ocr-3678469497.png stdout*,

it correctly detects and returns *NO SMOKING*.

*Environment*

- OS: Linux (Server)
- Tesseract version: tesseract 5.x (CLI works fine)
- Go binding: github.com/otiai10/gosseract/v2
- Go version: go1.23.x

I've tried the following approaches, with no effect:

- Different PSM modes (SPARSE_TEXT, SINGLE_BLOCK, etc.)
- Preprocessing (grayscale, contrast enhancement, flattening transparency).
- Verified that the image file is saved correctly and is readable by
  Tesseract.
- Tried increasing image size and contrast.

Is there any known discrepancy between the CLI binary and the gosseract API
in how page segmentation modes or image preprocessing are handled
internally?

Any insight into why Tesseract detects the text from the CLI while the
gosseract binding returns empty output would be very helpful.

Best Regards,

Harshit Goel

