What command are you using? For a bilingual text you should pass both
languages, e.g. -l fra+eng or something similar. The resolution may also be
too high: 600 dpi is typically the upper limit (though what really matters
is the letter height in pixels). You could limit the character set too...
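
To make those three suggestions concrete, here is a minimal sketch in Python that only builds the command line and a downscale factor; it does not run tesseract itself. The helper names, the file names, and the whitelist string are hypothetical, chosen for illustration. It also assumes a tesseract build that accepts "-c name=value" for the tessedit_char_whitelist variable; older builds may need a config file instead.

```python
import shlex


def build_tesseract_command(image_path, output_base,
                            languages=("fra", "eng"), whitelist=None):
    """Return the argument list for a bilingual tesseract run.

    Hypothetical helper: it only assembles arguments, it does not call tesseract.
    """
    # Languages are joined with "+", e.g. "fra+eng".
    cmd = ["tesseract", image_path, output_base, "-l", "+".join(languages)]
    if whitelist:
        # Assumption: this build accepts -c for config variables.
        cmd += ["-c", "tessedit_char_whitelist=" + whitelist]
    return cmd


def downscale_factor(current_dpi, max_dpi=600):
    """Scale factor that brings a scan down to max_dpi, never upscaling."""
    return min(1.0, max_dpi / current_dpi)


if __name__ == "__main__":
    # File names and whitelist are placeholders, not taken from the thread.
    cmd = build_tesseract_command("page.png", "page", whitelist="abcdef")
    print(shlex.join(cmd))
    print(downscale_factor(1200))
```

The 600 dpi cap is just the rule of thumb from the paragraph above. Since letter height is what really matters, it may be better to measure the height of a typical letter in the scan and scale on that instead.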
Sven

On Tuesday, August 6, 2013, wrote:

> Hi,
>
> I am trying to recognize an 18th century text for academic purposes. I
> followed the (very helpful) tutorial, and encountered no technical
> problems. However, the recognition rate is disappointing. I think the
> source material may just be too difficult for tesseract 3 (see sample
> image <http://i.imgur.com/d5RnxI4.png> and recognized text below). The
> difficulties are multiple: 3 fonts, 2 languages (bilingual text), obsolete
> spellings, variable stroke width... I retrained tesseract on 10 samples of
> each character, without much improvement.
>
> Could someone tell me if this is feasible? Or maybe the state of the art
> in OCR has not yet reached this kind of performance...
>
> Thanks for the insight!
>
> Fabrizio
>
> --
>
> Image: http://i.imgur.com/d5RnxI4.png
>
> *Recognized text for image*
>
> ACCOLADE,  [embraffement] A bug, clîppl’ng and
> colling. Je hazardaî quèlques accolades qui ne îûrent pâs
> trop mal reçûes, I ventured ſome bugs, wbicb were not very
> îll receîved. * Nous nous mimes ä domler des accolades â
> notre boutèille, PVc./ëll ta bugging our bottle. ☞ Il l’a fait
> Chevalîér en lui donnant l’accolade, He bar dubbcd hl’ln a
> K.wigbt. ☞ Sèrvîr unc accolade de lapereaûx (une couple)
> To jZ-rve o couple oj’yortng rabbîts în one dffla.
>
>
>  --
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>
> ---
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/groups/opt_out.
>
>
>


-- 
``All that is gold does not glitter,
  not all those who wander are lost;
the old that is strong does not wither,
  deep roots are not reached by the frost.
From the ashes a fire shall be woken,
  a light from the shadows shall spring;
renewed shall be blade that was broken,
  the crownless again shall be king.”
