Manuel,

The sample you provided definitely has insufficient resolution, so you
can only expect part of the heading to be recognized. That is exactly
what happened when I ran recognition on your image. But I didn't get
any error or warning messages with my "por.traineddata" at all!

However, all this was tested under Windows. I could probably try it
under Ubuntu, but I don't know when I'll have enough time to reboot,
set up a C++ compiler, build Tesseract and do some testing, sorry ))

Are you sure you downloaded the latest stable version of Tesseract?
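
The assert in your log compares the entry count stored in the
traineddata file with the TESSDATA_NUM_ENTRIES compiled into your
binary, so one possible cause (a guess, not something I have verified)
is a file built by a newer Tesseract than the one reading it. Below is
a rough Python sketch for inspecting that count. It assumes the file
starts with a 32-bit entry count followed by that many 64-bit offsets,
which is how I read tessdatamanager.cpp. The function name and the
MAX_PLAUSIBLE_ENTRIES bound are made up for this example.

```python
import struct

# Assumed sanity bound, used only to guess the byte order of the header.
MAX_PLAUSIBLE_ENTRIES = 1000


def read_traineddata_header(path):
    """Return (num_entries, offsets) read from the start of the file.

    Assumed layout: int32 entry count, then one int64 offset per entry.
    """
    with open(path, "rb") as f:
        raw = f.read(4)
        if len(raw) != 4:
            raise ValueError("file too short to hold an entry count")

        # Try little-endian first; fall back to big-endian if implausible.
        endian = "<"
        (num_entries,) = struct.unpack("<i", raw)
        if not 0 <= num_entries <= MAX_PLAUSIBLE_ENTRIES:
            endian = ">"
            (num_entries,) = struct.unpack(">i", raw)
        if not 0 <= num_entries <= MAX_PLAUSIBLE_ENTRIES:
            raise ValueError("implausible entry count")

        table = f.read(8 * num_entries)
        if len(table) != 8 * num_entries:
            raise ValueError("offset table is truncated")
        offsets = list(struct.unpack(f"{endian}{num_entries}q", table))

    return num_entries, offsets
```

Calling read_traineddata_header("por.traineddata") on my file and on
the stock one should show whether the two declare a different number of
entries. If mine reports more, that would point to a version mismatch
rather than a problem with your image.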

Warm regards,
Dmitry Silaev





On Thu, Mar 10, 2011 at 9:32 PM, manuel...@gmail.com
<manuel...@gmail.com> wrote:
> I just replaced por.traineddata with your por.traineddata file.
> After that I'm getting this error message:
>
>>> manuel$ tesseract input.tiff output -l por
>>> actual_tessdata_num_entries_ <= TESSDATA_NUM_ENTRIES:Error:Assert failed:in file tessdatamanager.cpp, line 55
>>> Segmentation fault
>
> I haven't succeeded. I'm using version 3 on Mac OS X 10.6.
>
>
>
> Attached Reported.tiff
>
>
>
>
>
>
> Regards
> Manuel Pardo
>
> On 04/03/2011, at 03:19, Dmitry Silaev wrote:
>
>> Manuel,
>>
>> Is the error message generated by version 2.xx? Did you try to run
>> version 3.xx with my "por.traineddata" file?
>> I don't get it - have you succeeded or not?
>> Please provide us with the image you are trying to recognize.
>>
>> Warm regards,
>> Dmitry Silaev
>>
>>
>>
>>
>>
>> On Thu, Mar 3, 2011 at 5:34 PM, manuel...@gmail.com <manuel...@gmail.com> 
>> wrote:
>>> Hi Dmitry,
>>>
>>> I just replaced my file with your por.traineddata,
>>> but I'm getting an error:
>>>
>>> manuel$ tesseract input.tiff output -l por
>>> actual_tessdata_num_entries_ <= TESSDATA_NUM_ENTRIES:Error:Assert failed:in file tessdatamanager.cpp, line 55
>>> Segmentation fault
>>>
>>> It seems worthwhile to convert old files from 2.0x to 3, because
>>> there is no Brazilian Portuguese for version 3, just "Portuguese".
>>> At least the dictionary por.traineddata is working correctly in version 3.
>>> The special chars are being recognized by Tesseract 3.
>>>
>>> regards,
>>> Manuel Pardo
>>>
>>>
>>>
>>>
>>> On 03/03/2011, at 09:12, Dmitry Silaev wrote:
>>>
>>>> Manuel,
>>>>
>>>> It's quite an interesting question although it may seem to be an
>>>> ordinary newbie-like one.
>>>>
>>>> I was always wondering if 2.xx files can be used with version 3.xx.
>>>> The wiki states that "the files in the traineddata file are different
>>>> from the list used prior to 3.00, and will most likely change,
>>>> possibly dramatically in future revisions."
>>>>
>>>> I have no time to investigate it in the code, so I decided to act
>>>> rather than to think. After some tinkering with all those files I
>>>> slipped the resulting "por.traineddata" into the Tesseract algo I'm
>>>> currently working on, and - guess what? - it worked! ))
>>>>
>>>> I must say it was tested only with a couple of *very simple* images
>>>> and also it absolutely lacks any dictionary-related data. And my test
>>>> images don't contain these specific Portuguese letters with
>>>> diacritics. So in fact this file may perform poorly. Please test and
>>>> report your results. The file is in the attachment.
>>>>
>>>> Making this training data file was not difficult at all, but it was
>>>> not so straightforward either, so the process probably deserves a
>>>> separate article, and later I'd like to post it on my blog.
>>>>
>>>> Warm regards,
>>>> Dmitry Silaev
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Mar 2, 2011 at 8:40 PM, manuelfhp <manuel...@gmail.com> wrote:
>>>>> Hello list,
>>>>> I can't find a solution for special chars.
>>>>>
>>>>> I installed Tesseract 3 on Mac OS X 10.6.
>>>>> It is running very well.
>>>>>
>>>>> But I'm having problems with the charset.
>>>>> I need Tesseract working with Brazilian Portuguese. (ISO8859-1)
>>>>>
>>>>> I installed the Portuguese dictionary, but it is not working with
>>>>> special chars like  Ç Ã É é ....  (ISO8859-1)
>>>>> Is there any solution?
>>>>>
>>>>> There is an old dictionary specifically for Brazilian Portuguese in
>>>>> version 2.0.4. Is it possible to use it in version 3? How?
>>>>>
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google Groups 
>>>>> "tesseract-ocr" group.
>>>>> To post to this group, send email to tesseract-ocr@googlegroups.com.
>>>>> To unsubscribe from this group, send email to 
>>>>> tesseract-ocr+unsubscr...@googlegroups.com.
>>>>> For more options, visit this group at 
>>>>> http://groups.google.com/group/tesseract-ocr?hl=en.
>>>>>
>>>>>
>>>>
>>>> <por.traineddata>
>>>
>>
>
>
