Re: [tesseract-ocr] Installation of Tesseract and some of its dependencies from source on CentOS

2018-10-16 Thread Zdenko Podobny
1. Why are you building a debug tesseract? 2. Why are you mixing build tools (cmake for leptonica and autotools for the rest)? There was a reported issue regarding this mix in the case of leptonica->tesseract... 3. jpeg, png, tiff are common libs heavily used by desktop systems. Replacing system

Re: [tesseract-ocr] Making custom traineddata

2018-10-16 Thread Vinod Gattani
Thanks everyone. With the suggestions and by following this link "https://www.youtube.com/watch?v=WZLJucXZy-g", I was able to run a demo training for a font. I used Shreeshrii's GitHub repo "https://github.com/Shreeshrii/tessdata_ocrb". I need some help on the points below: If there any documentation avail

[tesseract-ocr] Installation of Tesseract and some of its dependencies from source on CentOS

2018-10-16 Thread Fatih Ertinaz
Hello all, I could not find proper documentation that explains each step of compiling from source explicitly, including leptonica and some other image libs. Therefore I created my own scripts and wanted to share them here. Hopefully others might benefit as well. Starting with libpn

Re: [tesseract-ocr] pixRead problem

2018-10-16 Thread Zdenko Podobny
I do not use vcpkg. I suggest you use cppan (you need to install it and put it on the path). For me it is stupidly easy and it takes about 15 minutes (on my computer and internet network): git clone https://github.com/tesseract-ocr/tesseract.git cd tesseract mkdir build64 cd build64 cppan .. cmake .. -G "Vi

Re: [tesseract-ocr] train tesseract OCR 4.0

2018-10-16 Thread Shree Devi Kumar
Please do not use tesseract 4.0 alpha. There have been many changes since then. Use the latest code from GitHub, which is 4.0.0-rc3, or install from Alex's PPA or from UB Mannheim (for Windows). Please read the wiki pages about training a new font for tesseract 4 - fine tuning for Impact. On Tu

Re: [tesseract-ocr] Multiple Languages

2018-10-16 Thread Shree Devi Kumar
Please try with tessdata_fast

RE: [tesseract-ocr] Multiple Languages

2018-10-16 Thread MariamHi
Yes, using best data From: Adrian Owen Sent: Tuesday, October 16, 2018 3:44 PM To: tesseract-ocr@googlegroups.com Subject: RE: [tesseract-ocr] Multiple Languages Are you using the best data: https://github.com/tesseract-ocr/tessdata_best ? From: tesseract-ocr@googlegroups.com [mailto:tesseract-

RE: [tesseract-ocr] Multiple Languages

2018-10-16 Thread Adrian Owen
Are you using the best data: https://github.com/tesseract-ocr/tessdata_best ? From: tesseract-ocr@googlegroups.com [mailto:tesseract-ocr@googlegroups.com] On Behalf Of MariamHi Sent: 16 October 2018 13:19 To: tesseract-ocr@googlegroups.com Subject: RE: [tesseract-ocr] Multiple Languages Result:

Re: [tesseract-ocr] train tesseract OCR 4.0

2018-10-16 Thread kislay bajpai
Hello Shree, I am confused about how to train tesseract 4.0 alpha for a new font (E 13B). Please help me with it. On Thursday, March 23, 2017 at 5:24:59 PM UTC+5:30, shree wrote: > > To read characters from an image, it is not necessary to train it. Just > use an appropriate traineddata. > > Training i

Re: [tesseract-ocr] pixRead problem

2018-10-16 Thread flaviumarc
Your posts are valuable to me; this is the first time I have tried to use tesseract. Regarding compiling leptonica and tesseract, it's an endless story :) I have taken vcpkg from here: https://github.com/Microsoft/vcpkg, and generated the exe from the .bat file. And then I have tried this command in the console: *v

RE: [tesseract-ocr] Multiple Languages

2018-10-16 Thread MariamHi
Result: SimplifiedArabic 'جوجل 600916 " يستعد لاقتحام أدمغتثا الاحد 9 سبتمبر 20185 الاقتصادية" من الرياض" 0 هل حدث وأن بحثت عن منتج معين عبر الإنترنت وتفاجأت باقتراحات عديدة لاحقا على حسابك في فيسبوك 808500 لشركات توفر منتجات مشابهة؟ الأمر ليس صدفة؛ ولكن ذلك يندرج ضمن استراتيجيات يتبن

[tesseract-ocr] Re: Installation of Tesseract in Silent Mode

2018-10-16 Thread Ketaki Fadnavis
Can someone please help? How can I install tesseract from the command line? I want to install tesseract through an install4j installer, so I will need command line arguments to install it in a custom location. On Thursday, 20 September 2018 15:13:02 UTC+5:30, Ketaki Fadnavis wrote: > > Hi, > > I want to in

Re: [tesseract-ocr] Why do I get such poor results from Tesseract for simple single character recognizing?

2018-10-16 Thread 'Yuliana Zigangirova' via tesseract-ocr
Thank you very much, I'll try all the suggested changes. I have already tried borders and they seem to work! Yuliana On Tuesday, October 16, 2018 at 9:04:04 AM UTC+3, zdenop wrote: > > >1. If you have a quality problem - it is good to play with the tesseract >executable instead of the API ;-) >2. I

RE: [tesseract-ocr] Multiple Languages

2018-10-16 Thread Adrian Owen
Did AUTO improve results? Work from a lossless format, e.g. png. Try this resize: using (Bitmap large = ResizeImage(png, (int)(png.Width * 3.125), (int)(png.Height * 3.125))) { /* apply filters, then run tesseract here */ } public static Bitmap ResizeImage(Image image, int width, int height) {
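The 3.125 factor in the resize above is what results from scaling a 96 DPI image up to 300 DPI (300 / 96 = 3.125). The message does not state the original DPI, so 96 is an assumption here, and the function names below are hypothetical. A minimal Python sketch of the arithmetic only (no image library involved):

```python
def scale_factor(source_dpi: float, target_dpi: float = 300.0) -> float:
    """Return the factor by which pixel dimensions must grow to reach target_dpi."""
    if source_dpi <= 0:
        raise ValueError("source_dpi must be positive")
    return target_dpi / source_dpi


def scaled_size(width: int, height: int, source_dpi: float,
                target_dpi: float = 300.0) -> tuple[int, int]:
    """Return the (width, height) in pixels after rescaling to target_dpi.

    Rounds to the nearest pixel; the C# snippet in the message truncates
    with an (int) cast instead, which can differ by one pixel.
    """
    factor = scale_factor(source_dpi, target_dpi)
    return round(width * factor), round(height * factor)


if __name__ == "__main__":
    # Assumed example: an 800x600 screenshot captured at 96 DPI.
    print(scale_factor(96))           # 3.125
    print(scaled_size(800, 600, 96))  # (2500, 1875)
```

If the source image is already at or above 300 DPI, the factor is 1.0 or less and enlarging is unlikely to help.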

RE: [tesseract-ocr] Multiple Languages

2018-10-16 Thread MariamHi
I have converted the image to 300 dpi with ImageMagick and tried again, with the same result. Resolution : Result : SimplifiedArabic "جوجل 600916 " يستعد لاقتحام أدمغتنا الاحد 9 سبتمبر 2018 الاقتصادية" من الرياض" هل حدث وأن بحثت عن منتج معين عبر الإنترنت وتفاجأت باقتراحات عديدة لاحقا على حساب

RE: [tesseract-ocr] Multiple Languages

2018-10-16 Thread Adrian Owen
Try PageSegmentationMode.AUTO. You may need to enlarge to 300 DPI; what's the original DPI? From: tesseract-ocr@googlegroups.com [mailto:tesseract-ocr@googlegroups.com] On Behalf Of MariamHi Sent: 16 October 2018 11:07 To: tesseract-ocr@googlegroups.com Subject: RE: [tesseract-ocr] Multiple Languages Ye

[tesseract-ocr] Re: GUI for Tesseract

2018-10-16 Thread flaviumarc
Look, I can easily create an MFC app where you can insert the code you need ... it costs nothing. I am an MFC enthusiast, and I am happy to help. On Tuesday, October 16, 2018 at 1:09:07 PM UTC+3, Mugunthan wrote: > > No, I haven't. Is there any other way to do this? > > > > On Tuesday, October

[tesseract-ocr] Re: GUI for Tesseract

2018-10-16 Thread flaviumarc
In C++ yes, there is another option: Qt. In other languages, I don't know. On Tuesday, October 16, 2018 at 1:09:07 PM UTC+3, Mugunthan wrote: > > No, I haven't. Is there any other way to do this? > > > > On Tuesday, October 16, 2018 at 7:59:20 AM UTC+5:30, Mugunthan wrote: >> >> Hi, >> >> How can

Re: [tesseract-ocr] Re: train more fonts on trained model fas in tesseract

2018-10-16 Thread kislay bajpai
Hello, Thanks for the prompt reply. I want to train tesseract 4.0 alpha for the font E13B. How can I train it? Please share the knowledge. On Tuesday, October 16, 2018 at 1:57:17 PM UTC+5:30, Soumik Ranjan Dasgupta wrote: > > Please see > https://github.com/tesseract-ocr/tesseract/wiki/Fonts#fonts-for

[tesseract-ocr] Re: GUI for Tesseract

2018-10-16 Thread Mugunthan
No, I haven't. Is there any other way to do this? On Tuesday, October 16, 2018 at 7:59:20 AM UTC+5:30, Mugunthan wrote: > > Hi, > > How can I develop a GUI Application with my traineddata files. I've > trained in LSTM and 3.05 and need to embed in a desktop application. How > can I do that??

[tesseract-ocr] Re: train more fonts on trained model fas in tesseract

2018-10-16 Thread kislay . bajpai
On Monday, May 14, 2018 at 7:15:15 PM UTC+5:30, reza wrote: > > hi > i tested tesseract 4 beta on persian lang , the results was good. but i > think needs more training on more fonts and texts. > how could we train more fonts and texts on model that exist in tesseract 4 > beta for persian lang

Re: [tesseract-ocr] Re: train more fonts on trained model fas in tesseract

2018-10-16 Thread Kislay Bajpai
Hello, Thanks for answering my question. I do not understand how to train Tesseract 4.0 alpha for a new font. Please help me out with this particular issue. -- Thanks & Regards *Kislay Bajpai* Software Programmer Image InfoSystems Private Limited B-38, First Floor Kalkaji New Delhi -

Re: [tesseract-ocr] pixRead problem

2018-10-16 Thread flaviumarc
No, right now I have compiled Cppan, and I am trying to figure out what I have to do next ... On Tuesday, October 16, 2018 at 12:29:33 PM UTC+3, zdenop wrote: > > You will do everything including complaining but not to read and follow > instructs. Right? ;-) > > https://github.com/tesseract-ocr/tesserac

RE: [tesseract-ocr] Multiple Languages

2018-10-16 Thread Adrian Owen
Try changing the order: English+Arabic. Any better? From: tesseract-ocr@googlegroups.com [mailto:tesseract-ocr@googlegroups.com] On Behalf Of MariamHi Sent: 16 October 2018 08:27 To: tesseract-ocr@googlegroups.com Subject: RE: [tesseract-ocr] Multiple Languages When I did pre-processing I get resul
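For readers using the tesseract executable rather than the API, the language order and the automatic page segmentation mode suggested in this thread map onto the `-l` and `--psm` options. A minimal sketch in Python that only builds the argument list; the function name and the file names are hypothetical, and `eng`/`ara` assume the matching traineddata files are installed:

```python
from typing import List, Optional


def tesseract_command(image_path: str, output_base: str, languages: List[str],
                      psm: int = 3, tessdata_dir: Optional[str] = None) -> List[str]:
    """Build an argument list for the tesseract CLI.

    languages are joined with '+' in the given order, so ["eng", "ara"]
    becomes "-l eng+ara". psm=3 is fully automatic page segmentation
    without orientation/script detection.
    """
    if not languages:
        raise ValueError("at least one language code is required")
    cmd = ["tesseract", image_path, output_base,
           "-l", "+".join(languages), "--psm", str(psm)]
    if tessdata_dir is not None:
        # Optional: point tesseract at a specific traineddata directory.
        cmd += ["--tessdata-dir", tessdata_dir]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names; compare both language orders on the same page.
    print(tesseract_command("page.png", "out_eng_ara", ["eng", "ara"]))
    print(tesseract_command("page.png", "out_ara_eng", ["ara", "eng"]))
```

The resulting list can be passed to `subprocess.run` once tesseract is on the path; running both orders on the same page makes the comparison asked for above straightforward.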

Re: [tesseract-ocr] pixRead problem

2018-10-16 Thread Zdenko Podobny
You will do everything, including complaining, except read and follow the instructions. Right? ;-) https://github.com/tesseract-ocr/tesseract/wiki/Compiling#windows Zdenko On Tue 16. 10. 2018 at 10:52 wrote: > It is a endless story :) > > I have downloded from here cppan, and I have tried to gene

Re: [tesseract-ocr] pixRead problem

2018-10-16 Thread flaviumarc
It is an endless story :) I have downloaded cppan from here, and I have tried to generate a .sln file with CMake ... but I get the following errors: CMake Error at CMakeLists.txt:130 (find_package): By not providing "FindCPPAN.cmake" in CMAKE_MODULE_PATH this project has asked CMake to find a pac

[tesseract-ocr] Re: GUI for Tesseract

2018-10-16 Thread flaviumarc
Have you worked in MFC/VC++? On Tuesday, October 16, 2018 at 5:29:20 AM UTC+3, Mugunthan wrote: > > Hi, > > How can I develop a GUI Application with my traineddata files. I've > trained in LSTM and 3.05 and need to embed in a desktop application. How > can I do that??

Re: [tesseract-ocr] pixRead problem

2018-10-16 Thread Zdenko Podobny
The easiest way for you would be to compile tesseract on Windows with cppan. Instructions are on the wiki... On Tue 16. 10. 2018, 10:14 wrote: > Thank you a lot for your prompt answer ! I really appreciate that ! > > I have run in cmd line: tesseract --help-extra, I don't spot any graphic > librar

Re: [tesseract-ocr] Re: train more fonts on trained model fas in tesseract

2018-10-16 Thread Soumik Ranjan Dasgupta
Please see https://github.com/tesseract-ocr/tesseract/wiki/Fonts#fonts-for-tesseract-training . On Tue, Oct 16, 2018 at 1:49 PM wrote: > Hello all, > > I want to train tesseract 4.0 alpha for a new font, is there anyone who > can help me on this topic. > > On Monday, May 14, 2018 at 7:15:15 PM U

[tesseract-ocr] Re: GUI for Tesseract

2018-10-16 Thread flaviumarc
Hi. I can help you with a desktop app that will do what you need. Just give me details regarding your task ... e.g. what you need the desktop app to display, etc. Or we can build this app together ... it is up to you. On Tuesday, October 16, 2018 at 5:29:20 AM UTC+3, Mugunthan wrote: > > Hi, >

[tesseract-ocr] Re: GUI for Tesseract

2018-10-16 Thread flaviumarc
We can do this in VC++/MFC. On Tuesday, October 16, 2018 at 5:29:20 AM UTC+3, Mugunthan wrote: > > Hi, > > How can I develop a GUI Application with my traineddata files. I've > trained in LSTM and 3.05 and need to embed in a desktop application. How > can I do that??

[tesseract-ocr] Re: train more fonts on trained model fas in tesseract

2018-10-16 Thread kislay . bajpai
Hello all, I want to train tesseract 4.0 alpha for a new font. Is there anyone who can help me with this topic? On Monday, May 14, 2018 at 7:15:15 PM UTC+5:30, reza wrote: > > hi > i tested tesseract 4 beta on persian lang , the results was good. but i > think needs more training on more fonts a

Re: [tesseract-ocr] pixRead problem

2018-10-16 Thread flaviumarc
Thank you very much for your prompt answer! I really appreciate it! I have run tesseract --help-extra on the command line, and I don't spot any graphics library option. I have to tell you that I am using Windows 10, and I have compiled leptonica with VS2017, taken from here: https://github.com/danbloomberg/

[tesseract-ocr] Re: Heads up: release of tesseract 4.0

2018-10-16 Thread Zdenko Podobny
Is there anybody here who builds and uses tesseract on Android? I would like to solve: https://github.com/tesseract-ocr/tesseract/issues/1393 Zdenko On Sun 14. 10. 2018 at 19:48 Zdenko Podobny wrote: > RC 3 is ready[1]. > > Please test, test, test. > Especially if you are building tesseract on other platf

Re: [tesseract-ocr] Making custom traineddata

2018-10-16 Thread Soumik Ranjan Dasgupta
You should uninstall (purge) v3 first. Then build the v4 from scratch. On Tue, Oct 16, 2018 at 12:23 PM Vinod Gattani wrote: > Robert/ Zdenko > > Yes, in the log I see version "3.4v". > > To install v4, I used the link "https://github.com/tesseract-ocr/tesseract". > I thought it has tesseract v

RE: [tesseract-ocr] Multiple Languages

2018-10-16 Thread MariamHi
When I did pre-processing, the result got worse. The issue is this: when I recognize a document in Arabic I get it almost correct, and when I recognize a document in English I get it correct, but when I recognize a mixed Arabic+English document ("Multiple"), all the English words come out as digits. How can I fix it? From:

Re: [tesseract-ocr] pixRead problem

2018-10-16 Thread Zdenko Podobny
Really? Where did you look??? What is the output of leptonica "./configure --help"??? What is printed on screen when you run leptonica configure? Zdenko On Tue 16. 10. 2018 at 9:03 wrote: > Hi zdenop. I have read here: > > > https://groups.google.com/forum/#!searchin/tesseract-ocr/Error$20in$20pi

Re: [tesseract-ocr] Making custom traineddata

2018-10-16 Thread Zdenko Podobny
You obviously forgot to uninstall tesseract 3.04. You cannot have 2 installations of tesseract unless you know your system and know how to handle this kind of situation. Whatever you do, you should understand what you are doing. Zdenko On Tue 16. 10. 2018 at 8:53 Vinod Gattani wrote
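One way to check whether an old installation is still the one being picked up is to look at which executable is first on the path and what version it reports. A minimal Python sketch, assuming the first line of the `tesseract --version` output looks like `tesseract 3.04.01` or `tesseract 4.0.0-rc3` (some builds prefix the number with `v`); the function names are hypothetical:

```python
import re
import shutil
from typing import Optional, Tuple


def find_tesseract() -> Optional[str]:
    """Return the full path of the first 'tesseract' executable on PATH, if any."""
    return shutil.which("tesseract")


def parse_tesseract_version(first_line: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from the first line of 'tesseract --version' output.

    Returns None when the line does not match the assumed format.
    """
    match = re.match(r"tesseract\s+v?(\d+)\.(\d+)", first_line.strip())
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    print(find_tesseract())
    print(parse_tesseract_version("tesseract 3.04.01"))
    print(parse_tesseract_version("tesseract 4.0.0-rc3"))
```

If the reported major version is 3 after building version 4, the older installation is still ahead on the path.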

Re: [tesseract-ocr] pixRead problem

2018-10-16 Thread flaviumarc
Hi zdenop. I have read here: https://groups.google.com/forum/#!searchin/tesseract-ocr/Error$20in$20pixReadMem$3A$20tiff$3A$20no$20pix$20returned$20by$20tesseract%7Csort:date/tesseract-ocr/v_xZzoiUMUo/fMx9XZ-cBQAJ that you told someone who had the same issue as me that "you decided to bui