Thanks for the quick response, but I already know about those APIs - let me 
try to explain with an example.

Let's say that ResultIterator says that it found the word "hello" in the 
image at position (100, 100), and TessResultIteratorWordFontAttributes says 
it's in font "Arial" with a height of 16.  In my Windows application, I can 
construct a 16-high Arial font and draw the word "hello" at (100, 100), and 
that does a good job of showing the user the OCR output.

But now let's say that ResultIterator continues and says that it found the 
word "goodbye" in the image at position (100, 300), and 
TessResultIteratorWordFontAttributes says it's in font "DejaVu Sans" with a 
height of 16.  If I tell Windows to construct a font named "DejaVu Sans", 
Windows won't have any idea what that is, and it will pick some random font 
from its list.  When I then have my Windows application draw the word 
"goodbye" at (100, 300), it's highly likely that the character widths in 
the font that Windows is using are very different from the character widths 
in the actual DejaVu Sans font, so the word "goodbye" will take up the 
wrong amount of space and I'll either end up with lots of white space or 
(more often) the words all run over each other.
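
To put rough numbers on it, here is a minimal Python sketch of the mismatch. 
The advance widths are made up for illustration (they are not real font 
metrics), and "Fallback" is a hypothetical stand-in for whatever font 
Windows substitutes.

```python
# Hypothetical average advance widths, in pixels per character at height 16.
# These values are invented for illustration; they are NOT real font metrics.
ADVANCE = {"DejaVu Sans": 9.5, "Fallback": 7.0}


def drawn_width(word, font):
    """Approximate width of a word as characters times average advance."""
    return len(word) * ADVANCE[font]


def layout_error(word, ocr_font, drawn_font):
    """Width drawn minus width expected from the OCR font.

    Positive: the drawn word overruns the space it had in the image.
    Negative: the drawn word leaves extra white space.
    """
    return drawn_width(word, drawn_font) - drawn_width(word, ocr_font)


if __name__ == "__main__":
    # Tesseract reports "DejaVu Sans", but Windows draws with "Fallback".
    print(layout_error("goodbye", "DejaVu Sans", "Fallback"))
```

With these made-up widths the error is negative, so the word leaves a gap; 
a wider substitute font would flip the sign and make words overlap.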

Does that make more sense?

Thanks,
Chris


On Friday, September 20, 2013 5:39:07 PM UTC-6, Quan Nguyen wrote:
>
> You'll need to access Tesseract API for such information, specifically, 
> ResultIterator and ResultIteratorWordFontAttributes. Check out the API 
> Example <http://code.google.com/p/tesseract-ocr/wiki/APIExample> page.
>
> Quan
>
>
> On Friday, September 20, 2013 3:42:14 PM UTC-5, [email protected] wrote:
>>
>> I would like to show the user the OCR output in my Windows application in 
>> a graphical form (the OCR'd characters, in the specified font, in the right 
>> location).  To do that, I need to pick a font to draw the OCR output 
>> text in, and it seems like I have two choices -
>> 1) Map the Tesseract font to something Windows can understand
>> 2) Use the actual Tesseract font
>>
>> For #1, Tesseract uses a lot of fonts that I've got on my Windows box 
>> (Times New Roman, Arial, etc.) but then it also comes up with some I don't 
>> have (Century Schoolbook).  Is there a way to enumerate all the names of 
>> the fonts that Tesseract might return?  I can then decide whether it's 
>> easier to find Windows equivalents for all the fonts, or to download fonts 
>> (if they are free and have nice licensing).
>>
>> For #2, it's not enough to just display the selected portion of the 
>> source image; that doesn't tell the user anything.  I would need a way to 
>> ask Tesseract, "what is the glyph for an uppercase G in an Arial font of 
>> height 34".  Does that exist?
>>
>> Thanks,
>> Chris
>>
>>

-- 
-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en
