2011/5/19 Mostafa <[email protected]>

> Hi Again,
>
> Seems nobody knows where it is hiding.
> Should I contact a CIA agent? lol

If somebody is really interested, she/he can find the answer ;-). Within 1 minute ;-) ([1] [2] [3]).
BTW: there is a Developers forum <http://groups.google.com/group/tesseract-dev>.

> But I am kinda serious about the data.

There were several requests for training data (in the forum, in issues). I did so too. There was no official reply to such requests. AFAIK Google is not obliged to release them, so I guess they have a reason for not providing them.

On the other hand, this could be an opportunity for the tesseract community :-): to create an alternative training set. As Ray mentioned ([3]), they use a "more automated training process based on rendering text from fonts", so training based on "real world" scanned documents could be interesting (but more difficult).

Zdenko

[1] http://code.google.com/p/tesseract-ocr/people/list
[2] http://code.google.com/p/tesseract-ocr/source/list
[3] http://groups.google.com/group/tesseract-dev/msg/1cdf3ebe8743d935

> Mostafa
>
> On May 18, 2:43 am, Илья <[email protected]> wrote:
> > He needs a table that contains all supported alphabetic characters.
> > Also, parts of scanned books could not be protected by copyright.
> >
> > Can you give any contacts of the "jpn.traindata" dev team?
> >
> > --
> > Best regards,
> > Ilia.
> >
> > On Tue, 17/05/2011 at 18:24 +0200, zdenko podobny wrote:
> > > On Tue, May 17, 2011 at 5:01 PM, Илья <[email protected]> wrote:
> > > > IMHO alphabets can't be protected by copyright.
> > >
> > > Mostafa did not ask for an alphabet. He asked for 'all the tif
> > > files that used for creating...' and the content of a tiff file
> > > (e.g. scanned books) could be protected by copyright.
> > >
> > > > --
> > > > Best regards,
> > > > Ilia.
> > > >
> > > > On Tue, 17/05/2011 at 09:24 -0400, Dmitri Silaev wrote:
> > > > > I think copyright issues are preventing the dev team from
> > > > > publishing these source files.
> > > > > However, you can try to contact this forum's moderator
> > > > > directly - he can probably make the decision to share.
> > > > >
> > > > > --
> > > > > Dmitri
> > > > >
> > > > > On Tue, May 17, 2011 at 4:58 AM, Mostafa <[email protected]> wrote:
> > > > > > Hi,
> > > > > >
> > > > > > I am interested in getting all the tif files that were used
> > > > > > for creating the jpn.traindata.
> > > > > > I just want to see how many characters are supported in that
> > > > > > file, because I have some other Japanese characters that
> > > > > > can't be recognized by the tesseract OCR.
> > > > > >
> > > > > > Does anybody know where those tif files are?
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > --
> > > > > > You received this message because you are subscribed to the
> > > > > > Google Groups "tesseract-ocr" group.
> > > > > > To post to this group, send email to [email protected]
> > > > > > To unsubscribe from this group, send email to
> > > > > > [email protected]
> > > > > > For more options, visit this group at
> > > > > > http://groups.google.com/group/tesseract-ocr?hl=en

