Re: [tesseract-ocr] lt-lstmtraining: genericvector.h:720: T& GenericVector::operator[](int) const [with T = char]: Assertion `index >= 0 && index < size_used_' failed.

2018-11-26 Thread Shree Devi Kumar
If you have the problem with the master version also, please open an issue on github. Please include a stack trace/debug information also. On Mon, 26 Nov 2018, 10:48 Khosrobeigy.zohreh And I have the problem again > *Kind regards,* > *Zohreh Khosrobeygi* > > *Student of IT* > > *University of

[tesseract-ocr] New jpn_vert.traineddata

2018-11-26 Thread Seokbong Choi
Hello all, Although our jpn_vert from best worked well, it didn't serve my purpose: reading comic books. So I retrained it with new fonts and the new expressions that most Japanese comic books use. https://github.com/zodiac3539/jpn_vert - Add more fonts - Othutome, the font

Re: [tesseract-ocr] Tesseract v4 generated incorrect text output

2018-11-26 Thread Seokbong Choi
Hello, OEM and PSM are values that you should set whenever you run tesseract.exe; the current version cannot detect them automatically. (I hope this can be improved in the next version.) I guess you are in a situation where the optimal result can be obtained through different
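A minimal sketch of what setting these values on every run could look like, assuming Tesseract 4's `--oem` and `--psm` command-line options. The helper name `build_tesseract_cmd`, the file names, and the particular PSM values tried are hypothetical and only illustrate comparing several settings on the same image.

```python
def build_tesseract_cmd(image_path, output_base, oem=1, psm=3, lang="eng"):
    """Return the argument list for one tesseract run with explicit OEM and PSM."""
    return [
        "tesseract", image_path, output_base,
        "-l", lang,
        "--oem", str(oem),  # OCR engine mode, e.g. 1 = LSTM engine only
        "--psm", str(psm),  # page segmentation mode, e.g. 6 = single uniform block of text
    ]


if __name__ == "__main__":
    # Print one command per candidate PSM so the outputs can be compared by hand.
    # "page.png" and the output names are placeholders, not files from the thread.
    for psm in (3, 6, 11):
        print(" ".join(build_tesseract_cmd("page.png", "out_psm%d" % psm, psm=psm)))
```

Each printed command can be run as-is, or the list can be passed to `subprocess.run` if Tesseract is installed.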

[tesseract-ocr] Analyze output from the OCR tutorial

2018-11-26 Thread Marziye Rahmati
Hello all, Can anyone help me understand the output from OCR training in version 4? For example, what does delta mean? At iteration 3052/5000/5102, Mean rms=0.85%, delta=0.98%, char train=3.846%, word train=5.917%, skip ratio=2.1%, New worst char error = 3.846 wrote checkpoint.
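A small sketch for pulling the numbers out of a training log line like the one quoted above, so they can be tracked across iterations. It does not interpret what each field means; the function name `parse_training_line` is hypothetical, and the regular expressions assume the line layout shown in the message.

```python
import re

# The log line quoted in the message above.
LOG_LINE = (
    "At iteration 3052/5000/5102, Mean rms=0.85%, delta=0.98%, "
    "char train=3.846%, word train=5.917%, skip ratio=2.1%, "
    "New worst char error = 3.846 wrote checkpoint."
)


def parse_training_line(line):
    """Extract the three iteration counters and every 'name=value%' field."""
    fields = {}
    match = re.search(r"At iteration (\d+)/(\d+)/(\d+)", line)
    if match:
        fields["iteration"] = tuple(int(g) for g in match.groups())
    # Field names are letters and spaces; values are percentages.
    for name, value in re.findall(r"([A-Za-z][A-Za-z ]*?)=([\d.]+)%", line):
        fields[name.strip()] = float(value)
    return fields


if __name__ == "__main__":
    for key, value in parse_training_line(LOG_LINE).items():
        print(key, value)
```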

[tesseract-ocr] Re: Extract Header and Footer text separately from document image

2018-11-26 Thread bohdan . moskalevskyi
Same here. I’m surprised this issue isn’t more common. Any solutions? On Monday, April 9, 2018 at 15:43:41 UTC+3, Mohit Jain wrote: > > Is there a way to extract the header and footer content on a document page > separately using Tesseract OCR? I tried the hOCR output but it doesn't

Re: [tesseract-ocr] How recognize footnotes

2018-11-26 Thread bohdan . moskalevskyi
hOCR doesn’t help; see also https://groups.google.com/forum/#!searchin/tesseract-ocr/footer%7Csort:date/tesseract-ocr/YY4jMNmSoTM/KAMTzkc5AQAJ On Tuesday, May 30, 2017 at 17:57:43 UTC+3, shree wrote: > > Try the `hocr` output and see if it provides some of what you need. > > I don't

[tesseract-ocr] Re: Handwriting training

2018-11-26 Thread DreadStarX
AFAIK, Tesseract doesn't do handwriting. I could be mistaken; there's another application that scans handwriting. On Monday, November 26, 2018 at 4:40:48 AM UTC-8, Rob wrote: > > Hello everyone, > > I am currently working on making a scanned fillable text document readable > for the computer.

Re: [tesseract-ocr] lt-lstmtraining: genericvector.h:720: T& GenericVector::operator[](int) const [with T = char]: Assertion `index >= 0 && index < size_used_' failed.

2018-11-26 Thread Khosrobeigy.zohreh
And I have the problem again *Kind regards,* *Zohreh Khosrobeygi* *Student of IT* *University of Tehran, 2016* *Phone: (+98)9196042887* *Email:khosrobeygi.zo...@ut.ac.ir * On Mon, Nov 26, 2018 at 6:45 PM Khosrobeigy.zohreh wrote: > I downloaded tesseract-master from github and reinstall

Re: [tesseract-ocr] lt-lstmtraining: genericvector.h:720: T& GenericVector::operator[](int) const [with T = char]: Assertion `index >= 0 && index < size_used_' failed.

2018-11-26 Thread Khosrobeigy.zohreh
I downloaded tesseract-master from GitHub and reinstalled it, and now my version is: tesseract 4.0.0 leptonica-1.76.0 libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.54 : libtiff 4.0.6 : zlib 1.2.8 Found AVX2 Found AVX Found SSE Is that right? *Kind regards,* *Zohreh Khosrobeygi* *Student

Re: [tesseract-ocr] lt-lstmtraining: genericvector.h:720: T& GenericVector::operator[](int) const [with T = char]: Assertion `index >= 0 && index < size_used_' failed.

2018-11-26 Thread Shree Devi Kumar
Please update to the latest version from github and try. On Mon, 26 Nov 2018, 08:36 Khosrobeigy.zohreh tesseract 4.0.0-beta.4 > leptonica-1.76.0 > libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.54 : libtiff 4.0.6 : zlib > 1.2.8 > Found AVX2 > Found AVX > Found SSE > > *Kind regards,* >

Re: [tesseract-ocr] lt-lstmtraining: genericvector.h:720: T& GenericVector::operator[](int) const [with T = char]: Assertion `index >= 0 && index < size_used_' failed.

2018-11-26 Thread Khosrobeigy.zohreh
tesseract 4.0.0-beta.4 leptonica-1.76.0 libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.54 : libtiff 4.0.6 : zlib 1.2.8 Found AVX2 Found AVX Found SSE *Kind regards,* *Zohreh Khosrobeygi* *Student of IT* *University of Tehran, 2016* *Phone: (+98)9196042887*

[tesseract-ocr] Handwriting training

2018-11-26 Thread Rob
Hello everyone, I am currently working on making a scanned fillable text document readable for the computer. This document can be filled in with computer writing as well as with handwriting. The quality of the scanned document is good enough and the font is not too small. I'm using Ubuntu

Re: [tesseract-ocr] lt-lstmtraining: genericvector.h:720: T& GenericVector::operator[](int) const [with T = char]: Assertion `index >= 0 && index < size_used_' failed.

2018-11-26 Thread Shree Devi Kumar
What is the version of tesseract? tesseract -v On Mon, 26 Nov 2018, 05:51 Zohreh Khosrobeygi Hi, > I have been running about 130G of data, which is 4000 files. My command is > > /home/kddlab/Desktop/tesseract-master/src/training/lstmtraining \ > --traineddata >

[tesseract-ocr] lt-lstmtraining: genericvector.h:720: T& GenericVector::operator[](int) const [with T = char]: Assertion `index >= 0 && index < size_used_' failed.

2018-11-26 Thread Zohreh Khosrobeygi
Hi, I have been running about 130G of data, which is 4000 files. My command is /home/kddlab/Desktop/tesseract-master/src/training/lstmtraining \ --traineddata /home/kddlab/Desktop/tesseract-master/src/training/langdata/fas/fas/fas.traineddata --net_spec