Please see https://github.com/Shreeshrii/tesstrain-ckb
This is for fine-tune training from script/Arabic, using text and fonts.
You would need to do steps similar to
https://github.com/Shreeshrii/tesstrain-ckb/blob/master/0-setup.sh
Shree,
Can you please help me with how to perform Arabic training on Tesseract 4?
Thank you.
On Thursday, May 4, 2017 at 3:22:42 PM UTC+5:30, shree wrote:
>
> Ibr,
>
> You are incorrect in your description of LSTM training.
>
> What you are doing will use the ara.traineddata provided in the repo,
replied to it
On Thursday, May 4, 2017 at 3:06:34 PM UTC+3, Ahmad Moawad wrote:
>
> check ur email
>
> On Thursday, May 4, 2017 at 1:51:04 PM UTC+2, Ibr wrote:
>>
>> ibr.h...@gmail.com
>>
>> On Thursday, May 4, 2017 at 2:47:12 PM UTC+3, Ahmad Moawad wrote:
>>>
>>> Ibr give me your email!
>>>
>>>
ibr.ham...@gmail.com
On Thursday, May 4, 2017 at 2:47:12 PM UTC+3, Ahmad Moawad wrote:
>
> Ibr give me your email!
>
> On Thursday, May 4, 2017 at 1:06:22 PM UTC+2, Ibr wrote:
>>
>> While I was creating lstmf files so I can use them in recognizing text
>> images, I found that some of the
Hi shree,
Actually, I saw the section about lstmtraining, but what I said was the
result of following the Tesseract messages. What happened from the
beginning was that I used to train .traineddata files for English, and that
worked fine, but for Arabic it was failing, so I saw the
Check your email.
On Thursday, May 4, 2017 at 1:51:04 PM UTC+2, Ibr wrote:
>
> ibr.h...@gmail.com
>
> On Thursday, May 4, 2017 at 2:47:12 PM UTC+3, Ahmad Moawad wrote:
>>
>> Ibr give me your email!
>>
>> On Thursday, May 4, 2017 at 1:06:22 PM UTC+2, Ibr wrote:
>>>
>>> while I was creating lstmf
Ibr give me your email!
On Thursday, May 4, 2017 at 1:06:22 PM UTC+2, Ibr wrote:
>
> While I was creating lstmf files so I can use them in recognizing text
> images, I found that some of the characters are recognized in a wrong way,
> some of them are not integrated in the tesseract and some
While I was creating lstmf files so I can use them in recognizing text
images, I found that some of the characters are recognized in a wrong way.
Some of them are not integrated in Tesseract, and some of them are due to
certain writing in Arabic itself;
in this case Tesseract acts
As for jTessBoxEditor 2.0, I tried it, but I didn't get any results.
As for your question about how much training data is sufficient to get the
best results for a new font (e.g. how many TIFF pages),
I think this was mentioned in the wiki:
https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00
On
Ibr,
You are incorrect in your description of LSTM training.
What you are doing will use the ara.traineddata provided in the repo, there
will be no change in output.
Once lstmf files are created, you have to run lstmtraining which will run
for days/weeks to give you a good result.
Please read
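To illustrate the point above (creating lstmf files alone changes nothing until lstmtraining is run), here is a minimal Python sketch that only builds the command lines for a fine-tuning run; it does not execute them. The flag names follow the TrainingTesseract-4.00 wiki as I understand it and may differ in a 4.00alpha build. The helper names, paths, and the iteration count are hypothetical placeholders, not values from this thread.

```python
from pathlib import Path


def write_listfile(lstmf_dir, listfile):
    """Write one .lstmf path per line (the list passed to --train_listfile)."""
    paths = sorted(str(p) for p in Path(lstmf_dir).glob("*.lstmf"))
    Path(listfile).write_text("\n".join(paths) + "\n", encoding="utf-8")
    return paths


def finetune_commands(traineddata, listfile, out_prefix, max_iterations=400):
    """Return the commands for a fine-tuning run, as argument lists.

    Assumption: flags as documented in the TrainingTesseract-4.00 wiki.
    max_iterations is a placeholder; pick it for your own data.
    """
    lstm = out_prefix + ".lstm"
    return [
        # 1. Extract the LSTM model from the existing traineddata.
        ["combine_tessdata", "-e", traineddata, lstm],
        # 2. Continue training from that model on the lstmf files.
        ["lstmtraining", "--continue_from", lstm,
         "--traineddata", traineddata,
         "--train_listfile", listfile,
         "--model_output", out_prefix,
         "--max_iterations", str(max_iterations)],
        # 3. Convert the last checkpoint into a usable traineddata file.
        ["lstmtraining", "--stop_training",
         "--continue_from", out_prefix + "_checkpoint",
         "--traineddata", traineddata,
         "--model_output", out_prefix + ".traineddata"],
    ]
```

Each returned list could be passed to subprocess.run once the paths are checked against your own setup. Only the output of step 3 is a new traineddata file; until then recognition still uses the original ara.traineddata.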
My scenario is about training from images, not from text. I
want to fine-tune characters such as:
لمجرد not ملجرد
and so on.
On Thursday, May 4, 2017 at 11:28:13 AM UTC+2, Ibr wrote:
>
> If you are referring to Tesseract 4.00alpha with Leptonica 1.74.1, and if
> you compiled
If you are referring to Tesseract 4.00alpha with Leptonica 1.74.1, and if
you compiled them correctly and got the binaries that you need for
training lstmf files, then I recommend following the suggestions
made by the Tesseract devs, which is: once you create an .lstmf file for a
> I think jTessBoxEditor 2.0 has been updated to include Tesseract 4.00dev.
>
> 1- Could anybody confirm, because I am not getting better results for
> Arabic using it.
>
2- How much training data is sufficient to have the best results for a new
font, e.g. how many TIFF pages?
jTessBoxEditor 2.0 beta versions bundle the latest Tesseract 4.00alpha
training executable. The training process for 4.00, however, has not been
integrated into the program. The 3.0x training process is still supported.
Check out the two videos that depict the 3.0x training process: