word detection engine

Amit Man Wed, 01 Jul 2020 11:46:07 -0700

I couldn't find anything there about improving word detection.
Am I missing something?



On Wednesday, July 1, 2020 at 12:45:36 PM UTC+3, zdenop wrote:
>
> Try this:
>
> https://github.com/Sintun/PersonalHelperPrograms/blob/master/Tesseract/tess.cpp
>
> Longer story:
> https://github.com/tesseract-ocr/tesseract/issues/1714  
>
> Zdenko
>
>
> On Wed, 1 Jul 2020 at 10:29, amit...@gmail.com <ami...@gmail.com> wrote:
>
>> I want to optimise tesseract 4 (lstm) for a set of documents I have.
>> I managed to improve its character recognition using the documentation in 
>> https://tesseract-ocr.github.io/tessdoc/TrainingTesseract-4.00.
>>
>> However, some words are simply not detected, usually words inside tables. 
>> Even using --psm 6, some are missed.
>>
>> Is there a way to train the layout/segmentation/word detection engine and 
>> not just the character recognition?
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to tesser...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/tesseract-ocr/e210bfe2-563a-48a5-b0bc-5363c7269bcfn%40googlegroups.com.
>>
>
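
One way to pin down which words the segmentation step misses, as described in the
question above, is to compare the word boxes Tesseract reports under two page
segmentation modes. The sketch below is only a diagnostic and does not train
anything. It assumes the tesseract command-line tool is on PATH and relies on its
TSV output, where rows with level 5 are words. The helper names
(run_tesseract_tsv, word_boxes, words_only_in) are made up for this illustration.

```python
import csv
import io
import subprocess


def run_tesseract_tsv(image_path, psm):
    """Run the tesseract CLI on one image and return its TSV output as text."""
    result = subprocess.run(
        ["tesseract", image_path, "stdout", "--psm", str(psm), "tsv"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def word_boxes(tsv_text):
    """Return (text, left, top, width, height) for every word-level TSV row."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    words = []
    for row in reader:
        text = (row.get("text") or "").strip()
        # level 5 marks a word; higher-level rows carry no text
        if row.get("level") == "5" and text:
            words.append((text, int(row["left"]), int(row["top"]),
                          int(row["width"]), int(row["height"])))
    return words


def words_only_in(first, second):
    """Words (matched by text only) present in `first` but absent from `second`."""
    seen = {w[0] for w in second}
    return [w for w in first if w[0] not in seen]
```

For example, words_only_in(word_boxes(run_tesseract_tsv("page.png", 11)),
word_boxes(run_tesseract_tsv("page.png", 6))) would list words found in sparse-text
mode (--psm 11) that --psm 6 missed, together with their positions. Here "page.png"
is a placeholder file name, and matching by text alone is a simplification.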

Re: [tesseract-ocr] training the layout/segmentation/word detection engine
