$ combine_tessdata -u ./third_party/tesseract/tessdata/kan.traineddata ./kan.
Extracting tessdata components from ./third_party/tesseract/tessdata/kan.traineddata
Wrote ./kan.unicharset
Wrote ./kan.inttemp
Wrote ./kan.pffmtable
Wrote ./kan.normproto
Wrote ./kan.punc-dawg
Wrote ./kan.word-dawg
Wrote ./kan.number-dawg
Wrote ./kan.freq-dawg

$ ls kan.*
kan.freq-dawg  kan.inttemp  kan.normproto  kan.number-dawg  kan.pffmtable  kan.punc-dawg  kan.unicharset  kan.word-dawg

$ dawg2wordlist kan.unicharset kan.word-dawg word.wordlist
Loading word list from kan.word-dawg
Reading squished dawg
Word list loaded.

$ wc -l word.wordlist
18720 word.wordlist

It looks like the Kannada word dawg contains 18,720 words, and it
unpacks cleanly on my machine.
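
To repeat the check for every dawg in one go, a small loop along these
lines should do. This is only a sketch: dump_dawgs and the
DAWG2WORDLIST override are names I made up for illustration, and it
assumes the kan.* files extracted above are in the current directory.

```shell
# Convert every extracted dawg for one language prefix into a plain word list.
# DAWG2WORDLIST is a hypothetical override for the tool path; it defaults to
# the dawg2wordlist binary found on PATH.
dump_dawgs() {
  prefix="$1"
  tool="${DAWG2WORDLIST:-dawg2wordlist}"
  for kind in punc word number freq; do
    dawg="${prefix}.${kind}-dawg"
    # Skip components that were not present in the traineddata file.
    [ -f "$dawg" ] || continue
    # Same argument order as the usage text: unicharset, dawg, output list.
    "$tool" "${prefix}.unicharset" "$dawg" "${prefix}.${kind}.wordlist" || return 1
  done
}
```

Calling "dump_dawgs kan" followed by "wc -l kan.*.wordlist" then shows
the word count for each dawg side by side.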



On Mar 7, 8:43 am, "Sriranga(78yrs)" <withblessing.sriranga.
1...@gmail.com> wrote:
> David,
> I have just checked kan.punc-dawg (1KB) and kan.number-dawg (1KB) as well.
> Both work fine, and in both cases the output was not empty. Only
> word-dawg (181KB) and freq-dawg (2KB) do not work; for those, Windows
> displayed its "encountered a problem" message for the exe.
> I am bringing this to your kind notice. I have also attached the files
> kan.word-dawg and kan.freq-dawg for your investigation and valuable guidance.
> With warmest regards,
> -sriranga(79yrs)
>
> On Wed, Mar 7, 2012 at 9:44 AM, Sriranga(78yrs) <withblessing.sriranga.1...@gmail.com> wrote:
> > David,
> > Thanks for the valuable guidance.
> > I copied dawg2wordlist.exe into the folder n:\Newfolder\ where the
> > extracted files Kan.unicharset, kan.word-dawg, and kan.freq-dawg are located.
>
> > An extract from cmd is reproduced below. The Windows "encountered a
> > problem" messages were displayed for word-dawg and freq-dawg.
> > M:\New Folder>dawg2wordlist.exe -h
> > Print all the words in a given dawg.
> > Usage: dawg2wordlist.exe <unicharset> <dawgfile> <wordlistfile>
>
> > M:\New Folder>dawg2wordlist.exe kan.unicharset kan.word-dawg testwordlist
> > Loading word list from kan.word-dawg
> > Reading squished dawg
>
> > M:\New Folder>dawg2wordlist.exe kan.unicharset kan.freq-dawg testwordlist
> > Loading word list from kan.freq-dawg
> > Reading squished dawg
> > Word list loaded.
> > M:\New Folder>
>
> >    [Note: for kan.freq-dawg (2KB), testwordlist was created but is 0 KB
> >     in size, whereas for kan.word-dawg (181KB), testwordlist was not
> >     generated at all.]
> > Awaiting further valuable guidance.
> > With regards,
> > -sriranga(79yrs)
>
> > I still cannot understand where I made a mistake.
> > With regards,
> > -sriranga(79yrs)
>
> > On Wed, Mar 7, 2012 at 2:41 AM, David Eger <david.e...@gmail.com> wrote:
>
> >> Where you put wordlist2dawg.exe, try putting the name of the output list
> >> instead.
>
> >> On Friday, March 2, 2012 2:39:33 AM UTC-8, sriranga(79yrsold) wrote:
>
> >>> I had extracted kan.word-dawg from the Kan.traineddata. I am trying to
> >>> convert dawg to wordlist using commandline in cmd as follows:
>
> >>> M:\r684\BuildFolder\tesseract-ocr>dawg2wordlist "m:\New Folder\kan.unicharset" "m:\New Folder\kan.word-dawg" wordlist2dawg.exe
> >>> Loading word list from m:\New Folder\kan.word-dawg
> >>> Reading squished dawg
>
> >>> M:\r684\BuildFolder\tesseract-ocr>
> >>> Unfortunately, Windows displayed its "encountered a problem" message
> >>> for the exe. Where did I make a mistake? Awaiting a solution.
>
> >>  --
> >> You received this message because you are subscribed to the Google
> >> Groups "tesseract-ocr" group.
> >> To post to this group, send email to tesseract-ocr@googlegroups.com
> >> To unsubscribe from this group, send email to
> >> tesseract-ocr+unsubscr...@googlegroups.com
> >> For more options, visit this group at
> >> http://groups.google.com/group/tesseract-ocr?hl=en
>
> Attachments:
>  kan.word-dawg (243K)
>  kan.freq-dawg (2K)
>  kan.punc-dawg (< 1K)
>  kan.number-dawg (< 1K)

