Hallur Guðjónsson,
do you want the compiled Tesseract3.02.exe?
If so, I'll send it to you.
On 24 May 2012 00:19, Hallur Guðjónsson wrote:
> Yes please post it here somewhere and I will try to compile it myself.
>
> Thank you
>
> Sincerely
>
> Hallur Orn
>
>
> On Wednesday, May 23, 2012 8:17:
On Wed, May 23, 2012 at 11:10 PM, Falke wrote:
> From what I see, there is no traineddata for the Roman latin
> alphabet. Essentially, the current eng.traineddata's shortcoming is
> its lack of the macron diacritic.
>
> Is it possible to add the macron glyphs to the already-existing
> eng.traine
I agree that Abbyy will do the job more accurately out of the box and is
easier to get started with.
You may also want to have a look at this article:
http://www.splitbrain.org/blog/2010-06/15-linux_ocr_software_comparison
On Wednesday, May 23, 2012 9:03:31 PM UTC+4, Scott Oom wrote:
>
> We are wo
Yes, please post it here somewhere and I will try to compile it myself.
Thank you
Sincerely
Hallur Orn
On Wednesday, May 23, 2012 8:17:26 PM UTC, zdpo wrote:
>
> Officially 3.02 is not released, so there is not official (windows) binary
> version (you should compile it by yourself)...
> Anyway
From what I see, there is no traineddata for the Roman latin
alphabet. Essentially, the current eng.traineddata's shortcoming is
its lack of the macron diacritic.
Is it possible to add the macron glyphs to the already-existing
eng.traineddata? (the Ā, ā, Ē, ē, Ō, ō, Ū, ū)
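
Whichever training route is taken, it helps to pin down exactly which code points are meant. The following is a minimal Python sketch, not part of the original message, that builds the eight glyphs listed above from their base letters plus U+0304 (COMBINING MACRON) and prints each code point. The helper name `with_macron` is hypothetical, and the sketch says nothing about how Tesseract training itself handles these characters.

```python
import unicodedata

# U+0304 is the Unicode COMBINING MACRON.
COMBINING_MACRON = "\u0304"


def with_macron(base: str) -> str:
    """Return the precomposed (NFC) form of `base` followed by a macron."""
    return unicodedata.normalize("NFC", base + COMBINING_MACRON)


# The eight glyphs named in the message: A, E, O, U in both cases.
macron_glyphs = [with_macron(c) for c in "AaEeOoUu"]

for glyph in macron_glyphs:
    # Each entry is a single code point after NFC normalisation.
    print(glyph, f"U+{ord(glyph):04X}", unicodedata.name(glyph))
```

Normalising to NFC matters here because a text that stores "a" plus a combining macron as two code points will not compare equal to one that stores the single precomposed character.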
---
On Wed, May 23, 2012 at 7:19 AM, Hallur Guðjónsson wrote:
> Yes I read it carefully but I understood wrong at first, is there some place
> to get the 3.02 windows version of tesseract? do I have to compile it myself
> (because I'm a dumbass and don't know how to do that)
Now that I have written s
On Wed, May 23, 2012 at 10:20 PM, Sven Pedersen wrote:
> Hei Hallur,
> You can get the isl.traineddata file from subversion (SVN):
> http://code.google.com/p/tesseract-ocr/source/browse/trunk/tessdata/?r=656
>
> You can perhaps use that language file with the 3.01 version.
no, he can not. this is
Hei Hallur,
You can get the isl.traineddata file from subversion (SVN):
http://code.google.com/p/tesseract-ocr/source/browse/trunk/tessdata/?r=656
You can perhaps use that language file with the 3.01 version. You can
get Microsoft's free compiler and follow the recipe on the Wiki,
though it might
Officially, 3.02 has not been released, so there is no official (Windows) binary
version (you would have to compile it yourself)...
Anyway, I can post a current SVN build somewhere if needed (no support or
installer will be provided for this :-) ).
--
Zdenko
On Wed, May 23, 2012 at 4:19 PM, Hallur Guðjónsso
It is clear that, out of the box, Abbyy Fine Reader is more accurate.
It may well be still more accurate with training, maybe due to
post-processing. Many people who produce effective solutions on this
list use pre- and post-processing scripts to deal with various common
issues. With all that, Tess
Hi again,
I recently added a wordlist to my training, and was disappointed to
find that it didn't seem to substantially improve the results. I
suspect this is in significant part due to the unicharset not
recognising equivalent upper and lower case letters (and hence not
matching dictionary words
On Tue, May 22, 2012 at 05:21:23AM -0700, Galt wrote:
> On May 21, 2:04 am, Nick White wrote:
> > I've been suffering a very similar problem with some of the text I'm
> > training, which has several diacritics above and below glyphs. It
> > isn't infrequent to find quite a few lines of garbage whi
We are working on automated testing tools for applications and games.
We want to be able to verify various text in the UIs in different
languages and have been experimenting with Tesseract OCR and having a
lot of fun with it.
In 2007, Ray Smith mentioned that "Tesseract is now behind the leading
Yes, I read it carefully but misunderstood at first. Is there some
place to get the 3.02 Windows version of Tesseract? Do I have to compile it
myself (because I'm a dumbass and don't know how to do that)?
Sincerely
Hallur Örn
On Wednesday, May 23, 2012 11:51:36 AM UTC, zdpo wrote:
>
> Did y
Did you read my reply carefully?
See also the FAQ [1] (IMO the line number is not important in this case).
[1]
http://code.google.com/p/tesseract-ocr/wiki/FAQ#actual_tessdata_num_entries_<=_TESSDATA_NUM_ENTRIES:Error:Ass
--
Zdenko
On Wed, May 23, 2012 at 1:19 PM, Hallur Guðjónsson wrote:
> Yeah I trie
Yeah I tried to run it through CMD to see what the error was, and it
gives me this:
actual_tessdata_num_entries_ <= TESSDATA_NUM_ENTRIES:Error:Assert
failed:in file ..\ccutil\tessdatamanager.cpp, line 48
The author of Subtitle Edit pointed to this website for acquiring new
language packs, but I d
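
The assertion quoted above compares a count read from the language file with a constant compiled into the binary, so it can fire when a traineddata file has more entries than the installed Tesseract supports. As a rough sketch only, and assuming the 3.0x layout in which a traineddata file begins with a 32-bit entry count followed by a table of 64-bit offsets (an assumption from memory of the 3.0x sources, not confirmed in this thread), the count can be inspected like this. The function name `tessdata_entry_count` is hypothetical, and little-endian byte order is assumed.

```python
import struct


def tessdata_entry_count(path):
    """Read the leading 32-bit entry count of a traineddata file.

    Assumes the file starts with a little-endian int32 count; this is an
    assumption about the 3.0x file layout, not a documented API.
    """
    with open(path, "rb") as f:
        header = f.read(4)
    if len(header) < 4:
        raise ValueError("file is too short to contain an entry count")
    (count,) = struct.unpack("<i", header)
    return count
```

Running it on the same language file that triggers the assertion, and on one that loads fine, would show whether the two files differ in entry count.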
Thanks, Zdenko!
I found most of those same links too.
FYI here is Tess3.01 output:
Dul
fé
na
Gréine
.
.
.
.
3
In a nutshell, Tess 3.01 outputs this pattern for each word:
Dul
And judging by pdfbeads code, tess 3.00 did something like this for
each word:
Dul
Hi,
I want to run Tesseract on a mobile device, and therefore it's important
for me to use as little memory as possible.
When I run Tesseract 3.01 with eng, it uses about 8MB on
initialisation; eng.traineddata has a size of about 3MB.
When I run it with Japanese, it uses around 55MB with a
jpn.trainedd
Thanks Stane, you're the best.
On 22 May 2012 20:14, Stane wrote:
> Well, of course you need to give the right path as a parameter, and the
> output path must exist.
>
> I extracted it for you. Since I am not sure which Tesseract
> version you are working with, here are both:
> http://dl.dropbox.com/u/1028