I'd like to perform OCR on subimages that I am loading using OpenCV.
How can I convert this data into a form tesseract can work with?

So far I have tried two different approaches, but neither works. I hope
someone here can help me find a solution.

A.) Converting cv::Mat into Pix*

cv::Mat image = cv::imread("c:/image.png");
cv::Mat subImage = image(cv::Rect(50, 200, 300, 100));

int depth;
if(subImage.depth() == CV_8U)
    depth = 8;
//other cases not considered yet

PIX* pix = pixCreateHeader(subImage.size().width,
                           subImage.size().height, depth);
pix->data = (l_uint32*) subImage.data;

tesseract::TessBaseAPI tess;
STRING text;
if(tess.ProcessPage(pix, 0, 0, &text))
{
    std::cout << text.string();
}

However, OCR returns only unreadable characters.
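
One possible explanation, offered as an assumption rather than a confirmed diagnosis, is a mismatch in memory layout. By default cv::imread returns an 8-bit, 3-channel BGR image, so subImage.depth() == CV_8U holds even though each pixel occupies 24 bits. The sub-image is also only a view into the parent's buffer, so its rows are spaced by the parent's row size, whereas Leptonica pads every raster line to a whole number of 32-bit words. The snippet also shows no call to tess.Init(), which the API normally needs before recognition. The sketch below uses only the standard library to compare the two row sizes; the helper names and the parent width of 640 pixels are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of 32-bit words per raster line when every
// line is padded up to a whole number of 32-bit words (Leptonica-style).
std::size_t wordsPerLine(std::size_t width, std::size_t bitsPerPixel) {
    assert(bitsPerPixel > 0);
    return (width * bitsPerPixel + 31) / 32;
}

// Hypothetical helper: bytes between consecutive row starts of a view cut
// out of a larger, tightly packed, interleaved image. The view inherits
// the parent's row size, not its own width.
std::size_t viewRowStride(std::size_t parentWidth, std::size_t channels) {
    return parentWidth * channels;
}

// Bytes per line that a header created for `width` pixels at
// `bitsPerPixel` would expect to find in the pixel buffer.
std::size_t headerRowBytes(std::size_t width, std::size_t bitsPerPixel) {
    return wordsPerLine(width, bitsPerPixel) * 4;
}
```

For the 300-pixel-wide rectangle above, headerRowBytes(300, 8) and viewRowStride(640, 3) differ by a large factor, so a header that simply points at subImage.data would read each line from the wrong place.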

B.) Using cv::Mat for TesseractRect()

cv::Mat image = cv::imread("c:/image.png");
cv::Mat subImage = image(cv::Rect(50, 200, 300, 100));
char* cr = tess.TesseractRect(
           subImage.data,
           subImage.channels(),
           subImage.channels() * subImage.size().width,
           0,
           0,
           subImage.size().width,
           subImage.size().height);

This code doesn't work either; it also returns only unreadable
characters, although different ones from those produced by the code above.

Does anyone know what the problem could be? cv::Mat stores pixel data
as an array of type uchar, so it should be fine to pass to
TesseractRect without any conversion, since a UINT8* is required.
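
The element type does match, but the row layout may not. A sub-image taken with image(cv::Rect(...)) shares the parent's buffer, so consecutive rows are subImage.step bytes apart rather than channels * width bytes. Passing subImage.step as bytes_per_line, or calling subImage.clone() first to get a contiguous copy, are two options worth trying; this is a guess at the cause, not a verified fix. The standard-library sketch below shows what copying a strided view into a tightly packed buffer amounts to. ByteView and packRows are hypothetical names, not part of OpenCV or Tesseract.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a cv::Mat sub-image: a pointer into the
// parent's buffer plus the parent's row stride in bytes.
struct ByteView {
    const std::uint8_t* data;  // first byte of the top-left ROI pixel
    std::size_t width;         // ROI width in pixels
    std::size_t height;        // ROI height in pixels
    std::size_t channels;      // bytes per pixel
    std::size_t stride;        // bytes between consecutive row starts
};

// Copy the view row by row into a tightly packed buffer, so that
// bytes_per_line == channels * width really holds for the result.
std::vector<std::uint8_t> packRows(const ByteView& v) {
    const std::size_t rowBytes = v.width * v.channels;
    assert(v.stride >= rowBytes);
    std::vector<std::uint8_t> out;
    out.reserve(rowBytes * v.height);
    for (std::size_t y = 0; y < v.height; ++y) {
        const std::uint8_t* row = v.data + y * v.stride;
        out.insert(out.end(), row, row + rowBytes);
    }
    return out;
}
```

With a packed buffer, the bytes_per_line value of channels * width used in the call above would describe the data correctly; without packing, the stride has to be passed instead.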
