Re: [tesseract-ocr] Tesseract convert image to gibberish

2018-03-03 Thread ShreeDevi Kumar
Recommendation from Ray is to use tessdata_fast.
On Sat, Mar 3, 2018 at 11:27 PM, Dusayanta Prasad wrote:
> Which produces the better result, tessdata_fast or tessdata_best?
>
> On Saturday, March 3, 2018 at 6:26:58 PM UTC+5:30, shree wrote:
>> The exact directory will depend both on the typ

Re: [tesseract-ocr] Tesseract convert image to gibberish

2018-03-03 Thread Dusayanta Prasad
Which produces the better result, tessdata_fast or tessdata_best?
On Saturday, March 3, 2018 at 6:26:58 PM UTC+5:30, shree wrote:
> The exact directory will depend both on the type of training data and
> your Linux distribution. Possibilities are
> /usr/share/tesseract-ocr/tessdata or /usr/sha

Re: [tesseract-ocr] Tesseract convert image to gibberish

2018-03-03 Thread ShreeDevi Kumar
The exact directory will depend both on the type of training data and your Linux distribution. Possibilities are /usr/share/tesseract-ocr/tessdata, /usr/share/tessdata, or /usr/share/tesseract-ocr/4.00/tessdata.
ShreeDevi
भजन - कीर्तन
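To check which of these candidate directories actually holds the English model, something like the following sketch may help. The function name `find_tessdata` is made up for illustration and is not part of Tesseract; the three paths are the ones listed in the message above, and your distribution may use a different one.

```shell
#!/bin/sh
# Hypothetical helper (not a Tesseract tool): print the first directory
# that contains <lang>.traineddata.
# Usage: find_tessdata LANG DIR...
find_tessdata() {
    lang=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$lang.traineddata" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Candidate directories named in the message; others are possible.
find_tessdata eng \
    /usr/share/tesseract-ocr/tessdata \
    /usr/share/tessdata \
    /usr/share/tesseract-ocr/4.00/tessdata \
    || echo "eng.traineddata not found in the usual locations" >&2
```

If nothing is printed on standard output, the file is probably installed somewhere else, or not installed at all.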

Re: [tesseract-ocr] Tesseract convert image to gibberish

2018-03-03 Thread ShreeDevi Kumar
Also check:
tesseract --list-langs
ShreeDevi
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
On Sat, Mar 3, 2018 at 6:22 PM, ShreeDevi Kumar wrote:
> ls -l /home/dusayanta/tesseract/tessdata/eng.traineddata
>
> combine_tessdata -

Re: [tesseract-ocr] Tesseract convert image to gibberish

2018-03-03 Thread ShreeDevi Kumar
ls -l /home/dusayanta/tesseract/tessdata/eng.traineddata
combine_tessdata -d /home/dusayanta/tesseract/tessdata/eng.traineddata
ShreeDevi
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
On Sat, Mar 3, 2018 at 5:57 PM, ShreeDevi K

Re: [tesseract-ocr] Tesseract convert image to gibberish

2018-03-03 Thread ShreeDevi Kumar
No, I had not pre-processed the image. I used tessdata_fast, NOT tessdata_best.
ShreeDevi
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
On Sat, Mar 3, 2018 at 3:59 PM, Dusayanta Prasad wrote:
> Please tell me one more thing. Bef

Re: [tesseract-ocr] Tesseract convert image to gibberish

2018-03-03 Thread Dusayanta Prasad
Please help with this:
dusayanta@dusayanta:~/tessy$ tesseract -v
tesseract 4.00.00dev-731-gb9b08c7
leptonica-1.75.3
libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.54 : libtiff 4.0.6 : zlib 1.2.8
Found AVX
Found SSE
dusayanta@dusayanta:~/tessy$ tesseract book.tif book -l eng
Error opening d

Re: [tesseract-ocr] Tesseract convert image to gibberish

2018-03-03 Thread Dusayanta Prasad
Please help with this:
dusayanta@dusayanta:~/tessy$ tesseract book.tif book -l eng
Error opening data file /home/dusayanta/tesseract/tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Failed loading language 'eng'
Tesseract could
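One way to act on that error message is to confirm that the traineddata file really exists before pointing TESSDATA_PREFIX at a directory. The sketch below assumes, as the message itself says, that this build expects TESSDATA_PREFIX to name the "tessdata" directory; older releases may interpret the variable differently. The helper name `tessdata_prefix_hint` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not a Tesseract tool): if DIR holds LANG.traineddata,
# print the export line to use; otherwise report the missing file and fail.
# Usage: tessdata_prefix_hint DIR [LANG]
tessdata_prefix_hint() {
    dir=$1
    lang=${2:-eng}
    if [ -f "$dir/$lang.traineddata" ]; then
        printf "export TESSDATA_PREFIX='%s'\n" "$dir"
    else
        printf 'missing: %s/%s.traineddata\n' "$dir" "$lang" >&2
        return 1
    fi
}

# Path taken from the error message; harmless if it does not exist here.
tessdata_prefix_hint /home/dusayanta/tesseract/tessdata eng || true
```

If the helper reports the file as missing, the traineddata has to be downloaded or copied into that directory first. Alternatively, the `--tessdata-dir` option can be given on the tesseract command line instead of setting the variable.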

Re: [tesseract-ocr] Tesseract convert image to gibberish

2018-03-03 Thread Dusayanta Prasad
Please tell me one more thing. Before feeding the image to Tesseract, do you perform any kind of pre-processing, such as binarising the image? I didn't get the same result as yours even after trying Tesseract 4 with eng from tessdata_best.
On Saturday, March 3, 2018 at 3:38:07 PM

Re: [tesseract-ocr] Tesseract convert image to gibberish

2018-03-03 Thread shree
Sure, if you are comfortable building software on Linux. You have to make sure you have all the dependencies etc.

Re: [tesseract-ocr] Tesseract convert image to gibberish

2018-03-03 Thread Dusayanta Prasad
What if I build Leptonica and Tesseract from source, following the method on GitHub?

Re: [tesseract-ocr] Tesseract convert image to gibberish

2018-02-26 Thread ShreeDevi Kumar
You can download the latest version of tesseract-ocr and the appropriate traineddata from https://launchpad.net/~alex-p/+archive/ubuntu/tesseract-ocr
I ran tesseract via the command line with default values. You may need to remove the existing old version before installing the new one.
On 27-Feb-2018 1:14 AM, "D

Re: [tesseract-ocr] Tesseract convert image to gibberish

2018-02-26 Thread Dusayanta Prasad
I am using Tesseract from the Ubuntu command line. The version is:
tesseract 3.04.01
leptonica-1.73
libgif 5.1.2 : libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.54 : libtiff 4.0.6 : zlib 1.2.8 : libwebp 0.4.4 : libopenjp2 2.1.0
Regarding the part of gibberish text, I had to convert the image to .tif

Re: [tesseract-ocr] Tesseract convert image to gibberish

2018-02-26 Thread Dusayanta Prasad
Can you please send me the link for Tesseract 4? Please also tell me the method you used to perform the OCR.
On Sunday, February 25, 2018 at 9:48:32 PM UTC+5:30, shree wrote:
> Which version of tesseract are you using?
>
> See attached results with Tesseract 4 and eng from tessdata_fast
>
> ShreeDevi
> _

Re: [tesseract-ocr] Tesseract convert image to gibberish

2018-02-25 Thread Greg Dunkel
The scan is probably at too low a DPI. It is also slightly skewed.
On Sun, Feb 25, 2018 at 5:38 AM, Dusayanta Prasad wrote:
> I am trying to convert the image below using Tesseract on Linux with the
> following command:
>
> tesseract img.jpg out -l eng

Re: [tesseract-ocr] Tesseract convert image to gibberish

2018-02-25 Thread ShreeDevi Kumar
Which version of tesseract are you using?
See attached results with Tesseract 4 and eng from tessdata_fast.
ShreeDevi
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
On Sun, Feb 25, 2018 at 8:16 PM, Zdenko Podobny wrote:
> https

Re: [tesseract-ocr] Tesseract convert image to gibberish

2018-02-25 Thread Zdenko Podobny
https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality
Zdenko
2018-02-25 11:38 GMT+01:00 Dusayanta Prasad:
> I am trying to convert the image below using Tesseract on Linux with the
> following command:
>
> tesseract img.jpg out -l eng

[tesseract-ocr] Tesseract convert image to gibberish

2018-02-25 Thread Dusayanta Prasad
I am trying to convert the image below using Tesseract on Linux with the following command:
tesseract img.jpg out -l eng
and I am getting a result like this