Mostafa should seriously try to contact Ray directly.
Things may have changed over time.

--
Dmitri

2011/5/19 zdenko podobny <[email protected]>:
>
> 2011/5/19 Mostafa <[email protected]>
>>
>> Hi Again,
>>
>> Seems nobody knows where it is hiding.
>> Should I contact a CIA agent? lol
>
> If somebody is really interested, she/he can find the answer ;-). Within 1
> minute ;-) ([1] [2] [3]). BTW: there is a Developers forum.
>
>>
>> But I am kinda serious about the data.
>
> There were several requests for training data (in the forum, in issues). I
> asked too. There was no official reply to such requests. AFAIK Google is
> not obliged to release them, so I guess they have a reason for not providing
> them.
> On the other hand, this could be an opportunity for the tesseract community :-): to
> create an alternative training set. As Ray mentioned ([3]), they use a "more
> automated training process based on rendering text from fonts", so training
> based on "real world" scanned documents could be interesting (but more
> difficult).
>
> Zdenko
>
> [1] http://code.google.com/p/tesseract-ocr/people/list
> [2] http://code.google.com/p/tesseract-ocr/source/list
> [3] http://groups.google.com/group/tesseract-dev/msg/1cdf3ebe8743d935
>
>>
>> Mostafa
>>
>> On May 18, 2:43 am, Илья <[email protected]> wrote:
>> > He needs a table that contains all supported alphabetic characters.
>> > Also, parts of scanned books may not be protected by copyright.
>> >
>> > Can you give any contacts for the "jpn.traindata" dev team?
>> >
>> > --
>> >         Best regards,
>> >          Ilia.
>> >
>> > On Tue, 17/05/2011 at 18:24 +0200, zdenko podobny wrote:
>> >
>> > > On Tue, May 17, 2011 at 5:01 PM, Илья <[email protected]> wrote:
>> > >         IMHO alphabets can't be protected by copyright.
>> >
>> > > Mostafa did not ask for an alphabet. He asked for 'all the tif
>> > > files that used for creating...', and the content of a tiff file (e.g.
>> > > scanned books) could be protected by copyright.
>> >
>> > >         --
>> > >         Best regards,
>> > >         Ilia.
>> >
>> > >         On Tue, 17/05/2011 at 09:24 -0400, Dmitri Silaev wrote:
>> >
>> > >         > I think copyright issues are preventing the dev team from publishing
>> > >         > these source files. However, you can try to contact this forum's
>> > >         > moderator directly - he can probably decide to share them.
>> >
>> > >         > --
>> > >         > Dmitri
>> >
>> > >         > On Tue, May 17, 2011 at 4:58 AM, Mostafa <[email protected]> wrote:
>> > >         > > Hi,
>> >
>> > >         > > I am interested in getting all the tif files that were used for creating
>> > >         > > jpn.traindata.
>> > >         > > I just want to see how many characters are supported in that file,
>> > >         > > because I have some other Japanese characters that can't be recognized
>> > >         > > by the tesseract OCR.
>> >
>> > >         > > Does anybody know where those tif files are?
>> >
>> > >         > > Thanks
>> >
>> > >         > > --
>> > >         > > You received this message because you are subscribed to the Google
>> > >         > > Groups "tesseract-ocr" group.
>> > >         > > To post to this group, send email to [email protected]
>> > >         > > To unsubscribe from this group, send email to
>> > >         > > [email protected]
>> > >         > > For more options, visit this group at
>> > >         > > http://groups.google.com/group/tesseract-ocr?hl=en
>> >

