[tesseract-ocr] Re: Handwriting training

2018-11-26 Thread DreadStarX
AFAIK, Tesseract doesn't do handwriting, though I could be mistaken. There's 
another application that scans handwriting.

On Monday, November 26, 2018 at 4:40:48 AM UTC-8, Rob wrote:
>
> Hello everyone,
>
> I am currently working on making a scanned fillable text document readable 
> by the computer. This document can be filled in with typed text as 
> well as with handwriting. The quality of the scanned document is good 
> enough and the font is not too small. I'm using Ubuntu 18.04, Python 3 and 
> Tesseract 4.0.
>
> What is the best way to recognize both types of writing (in particular 
> handwriting)? Do you have some easy steps for me to achieve the training 
> for this problem?
> I found this: https://github.com/OCR-D/ocrd-train. It seems to make the 
> training process a lot easier, right?
>
> Thanks in advance and best wishes.
>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to tesseract-ocr+unsubscr...@googlegroups.com.
To post to this group, send email to tesseract-ocr@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/5b0553d0-1fae-4b5b-a8a6-01f058d1c337%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[tesseract-ocr] Re: Text Extraction from complex Table

2018-11-20 Thread DreadStarX
Since English text does not contain border characters, I don't think Tesseract 
will know how to handle them. You'll need to tell it how to add the borders 
to everything.

I had a similar problem, except I was pulling from a much larger, more complex 
table. I was using | as the border, and it kept adding a capital I for 
everything.
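
A minimal sketch of the kind of cleanup I mean, assuming you post-process the text Tesseract returns rather than retrain it. The function name `split_table_rows` and the set of "border-like" characters are my own assumptions, picked from the sample output quoted below (where a border came out as `[`); adjust them for your own scans.

```python
import re

# Characters Tesseract emitted in place of a vertical table rule.
# This set is an assumption based on the sample output in this thread.
BORDER_CHARS = r"[|\[\]]"

def split_table_rows(ocr_text):
    """Split raw OCR text into rows of cells, treating border-like characters as separators."""
    rows = []
    for line in ocr_text.splitlines():
        # Split on any border-like character, then trim whitespace around each cell.
        cells = [cell.strip() for cell in re.split(BORDER_CHARS, line)]
        # Drop empty cells produced by leading borders or blank lines.
        cells = [cell for cell in cells if cell]
        if cells:
            rows.append(cells)
    return rows

# Sample adapted from the first lines of the OCR output quoted in this thread.
sample = (
    "Time Table\n"
    "| Mon | Tue | Wed | Thu | Fri\n"
    "| Science | Maths [Science | Maths | arts\n"
)
rows = split_table_rows(sample)
for row in rows:
    print(row)
```

This only repairs the cell boundaries. It won't recover cells that Tesseract misread or dropped entirely, and it does nothing for a border that was read as a capital I, since that can't be told apart from a real letter by a simple split.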

On Saturday, November 17, 2018 at 1:39:50 AM UTC-8, Soumen Seth wrote:
>
> Hi Everyone,
>
> I am working on *python 2.7* and *pytesseract*. My tesseract version:
>
> tesseract 4.0.0-beta.1
>  leptonica-1.75.3
>   libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.2) : libpng 1.6.34 : 
> libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
>
> I am trying to extract text from a table with tesseract, but I am unable 
> to extract it properly. 
>
> I tried to extract the text from this table:
>
> [image: sample 2.png]
> and this is what I got:
>
> Time Table\n| Mon | Tue | Wed | Thu | Fri\n| Science | Maths [Science | Maths 
> | arts\nours S02! [History | English | Social | Sports\n\n \n\n \n\n \n\n 
> \n\nHe
>
>
> As you can see, this is far from satisfactory. *Can anyone please tell me how 
> to do it?*
>
>

To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/6e467097-1707-47d4-a5ab-68c90df255fd%40googlegroups.com.


[tesseract-ocr] Extracting Text from Onscreen vs Image

2018-10-18 Thread DreadStarX
Hey Guys/Gals,

I'm working on an application to assist myself and colleagues in our 
day-to-day tasks. Here's what I want to know.

Can Tesseract extract text directly from the screen without using an image? 
If yes, how fast can it read and decipher the text?

If Tesseract can't, then I planned on taking a scrolling screenshot, and 
having Tesseract run through that.
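
A rough sketch of that fallback, assuming Pillow (`ImageGrab.grab`) for the capture and pytesseract (`image_to_string`) for the OCR. The function name `ocr_screen` and its `grab`/`ocr` parameters are hypothetical, there only so the two steps can be swapped out. Note that it grabs only what is currently visible, not a scrolling capture.

```python
import time

def ocr_screen(grab=None, ocr=None):
    """Capture the screen as an image, OCR it, and return (text, seconds_taken)."""
    if grab is None:
        # Assumption: Pillow is installed and ImageGrab works on this platform.
        from PIL import ImageGrab
        grab = ImageGrab.grab
    if ocr is None:
        # Assumption: pytesseract and the tesseract binary are installed.
        import pytesseract
        ocr = pytesseract.image_to_string
    start = time.perf_counter()
    image = grab()      # screenshot of the currently visible screen
    text = ocr(image)   # the OCR step works on the captured image
    return text, time.perf_counter() - start
```

Calling `ocr_screen()` with no arguments would use the real screen and report how long the capture plus OCR took, which is one way to measure the speed on a given machine.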

To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/960a3bc2-acf1-4a62-a778-b51c99b9cf95%40googlegroups.com.