[tesseract-ocr] Re: Meter Reading with tesseract

2017-06-22 Thread Ngoc Thanh Huynh
Hi Jess, I know it was a long time ago, but are you still playing with Tesseract? I am currently doing my final-year project at uni and I am trying to use Tesseract, which doesn't give me a high success rate (about 20%). I would really like to know how you did your preprocessing and prepared your training data.

[tesseract-ocr] I am looking for the best way to OCR scan sports scoreboards (such as stadium scoreboards) for such items as time and scores

2017-06-22 Thread frank
I am experimenting with Tesseract, which does not do well, but maybe I can train it. Any hints on whether this is possible, or on a better way of getting times and scores from sports scoreboards? Scoreboards similar to https://www.google.com/search?b

[tesseract-ocr] tesseract ViewerDebugg

2017-06-22 Thread sfo
Hello! Can someone please help me extract the short list that Tesseract uses in classification? I found in the paper "An Overview of the Tesseract OCR Engine" that "classification proceeds as a two-step process. In the first step, a class pruner creates a shortlist of character clas

Re: [tesseract-ocr] Fine Tuning Iterations

2017-06-22 Thread Ibr
How can I know how many lines are in each lstmf file? I opened one with Notepad++ and it had only about 7 lines, and that can't be correct since I tried 61 fonts with 10 iterations

Re: [tesseract-ocr] Fine Tuning Iterations

2017-06-22 Thread Ibr
Thanks. On Thursday, June 22, 2017 at 1:01:13 PM UTC+3, shree wrote: > what is the number of iterations that will for sure cover the 40 lstmf files? It will depend on the number of lines in each file, e.g. if each file has 1000 lines, then 40,000 iterations should cover all files once.

Re: [tesseract-ocr] Fine Tuning Iterations

2017-06-22 Thread ShreeDevi Kumar
> what is the number of iterations that will for sure cover the 40 lstmf files? It will depend on the number of lines in each file, e.g. if each file has 1000 lines, then 40,000 iterations should cover all files once. You can use --target_error_rate 0.01 instead of the number of iterations as a guide
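
The arithmetic in this reply can be sketched in a few lines of Python. This is only an illustration of the rule of thumb stated above, assuming (as the reply implies) that one iteration consumes one training line; the function name `iterations_for_one_pass` is hypothetical and not part of any Tesseract tool.

```python
def iterations_for_one_pass(lines_per_file):
    """Return the iteration count needed to visit every training line once.

    Assumes one iteration processes one line, as in the reply above.
    `lines_per_file` is a list with the line count of each lstmf file.
    """
    return sum(lines_per_file)


# The example from the reply: 40 lstmf files with 1000 lines each.
print(iterations_for_one_pass([1000] * 40))  # 40000
```

If the files hold different numbers of lines, the same sum applies; the per-file counts would have to be obtained separately.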

[tesseract-ocr] Fine Tuning Iterations

2017-06-22 Thread Ibr
Hi, if I want to run the command: training/lstmtraining --model_output ~/tesstutorial/full_japanese/new \ --continue_from ~/tesstutorial/extracted_lstm/jpn.lstm \ --train_listfile ~/tesstutorial/jpntrain/jpn.training_files.txt \ --max_iterations 10 how can I match the --max_iterations

[tesseract-ocr] print regions (layout analysis results)

2017-06-22 Thread Andrey Razumovsky
Hello, I'm trying to see how Tesseract detects certain layout elements, e.g. tables. I am able to see the regions in ScrollView, e.g.: tesseract.exe -c textord_show_tables=true test.png test segdemo inter This works well; I can see tables being painted in a different color. Now I am trying to

Re: [tesseract-ocr] Need help training Simplified Chinese.

2017-06-22 Thread ShreeDevi Kumar
Your best bet for improving recognition is to preprocess the small and medium images to a larger size. Please see https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality Tesseract 4.00.00alpha currently has two different OCR engines in it. The legacy Tesseract engine is accessible with --oem
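
The advice about enlarging small and medium images can be illustrated with a small helper that works out the target dimensions before resizing. This is a rough sketch only: the function name `upscaled_size` and the `min_height` threshold are assumptions made for illustration, and the reply does not state which size to aim for. The actual resizing would be done with an image tool of your choice.

```python
import math


def upscaled_size(width, height, min_height):
    """Return (width, height) scaled by a whole-number factor so that the
    height is at least `min_height`. Images already tall enough are unchanged.

    `min_height` is a hypothetical threshold chosen by the caller.
    """
    if height >= min_height:
        return (width, height)
    factor = math.ceil(min_height / height)  # smallest integer factor that suffices
    return (width * factor, height * factor)


# A 100x20 pixel crop enlarged so that it is at least 60 pixels tall.
print(upscaled_size(100, 20, 60))  # (300, 60)
```

A whole-number factor is used here only to keep the example simple; a fractional factor would work equally well.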