Hi Sandeep,

That is great news, and good job. As for developing your proposal, you
may want to simply start with a Google Doc and then share it with Pei.
I'd be happy to help co-mentor if you and Pei think that would be
useful.

Your proposal should likely cover:

1. Background - what is the current state of CTAKES-189, and what is it
  trying to accomplish?
  (include some figures, etc. along with your text)

2. Approach - what are you going to do to solve CTAKES-189? Be specific,
  and try to break it down into smaller, easily reversible steps

3. Schedule - how long will this take, and what is the schedule for
  achieving it?

4. Risks/etc. - any known risks, e.g., are you taking a vacation anytime
  soon :) or are there other time constraints?

5. References, etc.

HTH, and I'd be happy for you to share the Google Doc with me as you
develop it.

Cheers!

Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: sandeep rg <sandeep.f...@gmail.com>
Reply-To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
Date: Saturday, July 13, 2013 8:57 AM
To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
Subject: Re: to involve in your development group

>I have also gone through the technologies available for OCR development.
>From those, I think Apache Tika and Tesseract are the best fit for solving
>the problem.
>
>
>On Sat, Jul 13, 2013 at 9:02 PM, sandeep rg <sandeep.f...@gmail.com>
>wrote:
>
>> Hi Chris Mattmann,
>> I participated in the event coordinated by Luciano Resende:
>>
>> http://community.apache.org/mentoringprogramme-icfoss-pilot.html
>>
>> From that I learned about open source, and I would like to work on your
>> project, cTAKES. I would like to fix this JIRA issue:
>>
>> https://issues.apache.org/jira/browse/CTAKES-189
>>
>> Pei Chen accepted my request to be my mentor. Now I want to submit a
>> proposal to Apache about the project I am going to work on. Can you help
>> me prepare a proposal, to be submitted before the 18th of this July?
>>
>>
>>
>>
>>
>>
>> On Sat, Jul 13, 2013 at 2:26 AM, Mattmann, Chris A (398J) <
>> chris.a.mattm...@jpl.nasa.gov> wrote:
>>
>>> Hi Sandeep,
>>>
>>> I think the best thing to do is:
>>>
>>> 1. Develop a JIRA issue here:
>>> https://issues.apache.org/jira/browse/CTAKES
>>>  1a. you can register for a new account on JIRA
>>> 2. Once your JIRA issue is created, feel free to start a [DISCUSS]
>>> thread (e.g., with subject [DISCUSS] "some topic" where "some topic" is
>>> perhaps the main idea you have) on dev@ctakes.apache.org, referencing
>>> your issue and asking for feedback
>>> 3. Work with the Apache cTAKES PMC and committers to get your patches
>>> and other items attached to your issue from #1 committed into the
>>> sources
>>>
>>> Apache is built on meritocracy, so ideally, if 1-3 happen and it's a
>>> good interaction, you could earn the merit to become a PMC member or
>>> committer on the project.
>>>
>>> Cheers,
>>> Chris
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: sandeep rg <sandeep.f...@gmail.com>
>>> Reply-To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>>> Date: Thursday, July 11, 2013 11:30 AM
>>> To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>>> Subject: Re: to involve in your development group
>>>
>>> >Can you tell me what details I should include in a proposal? Should I
>>> >include all the implementation (technical) details in the proposal?
>>> >
>>> >
>>> >On Thu, Jul 11, 2013 at 9:45 PM, Mattmann, Chris A (398J) <
>>> >chris.a.mattm...@jpl.nasa.gov> wrote:
>>> >
>>> >> Dear Sandeep,
>>> >>
>>> >> Thanks for your interest in cTAKES. We would welcome your contribution
>>> >> and are happy to have your interest in the project.
>>> >>
>>> >> Cheers,
>>> >> Chris
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> -----Original Message-----
>>> >> From: sandeep rg <sandeep.f...@gmail.com>
>>> >> Reply-To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>>> >> Date: Wednesday, July 10, 2013 11:01 AM
>>> >> To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>>> >> Subject: Re: to involve in your development group
>>> >>
>>> >> >Sir,
>>> >> >
>>> >> >My name is Sandeep RG. I am a B.Tech graduate in computer science, now
>>> >> >doing an internship at a company, working in Java.
>>> >> >
>>> >> >I have installed everything successfully and am now downloading the
>>> >> >resources. It takes too much time.
>>> >> >
>>> >> >I have gone through the suggested OCR technologies.
>>> >> >JavaOCR has some good user reviews.
>>> >> >Apache Tika can process many different file formats.
>>> >> >There is also Tesseract, which is used for OCR.
>>> >> >Apache PDFBox is also used for text extraction, but only for PDF
>>> >> >files.
>>> >> >I am now going through everything to find the best technology among
>>> >> >these.
>>> >> >
>>> >> >
>>> >> >On Wed, Jul 10, 2013 at 12:52 AM, Chen, Pei
>>> >> ><pei.c...@childrens.harvard.edu>wrote:
>>> >> >
>>> >> >> Hi Sandeep,
>>> >> >> I am delighted to work with you on this project.
>>> >> >>
>>> >> >> I was not sure if I understood you correctly- did you mean to say
>>> >> >> that you have already tried using cTAKES and its components?
>>> >> >> If not, you can do an svn checkout of the code and try running the
>>> >> >> debugger GUI from the command line (or Eclipse IDE), which will
>>> >> >> allow you to type in plain text and get back the different
>>> >> >> structured content (types) that cTAKES produces:
>>> >> >> MAVEN_OPTS="-Xmx2g -Xms1g"
>>> >> >> mvn -PrunCVD compile
>>> >> >> From the guide:
>>> >> >>
>>> >> >> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+Developer+Install+Guide
>>> >> >>
>>> >> >> A bit of background:
>>> >> >> Apache cTAKES uses SVN for version control:
>>> >> >> https://svn.apache.org/repos/asf/ctakes/trunk/
>>> >> >> Jira for issues tracking:
>>> >> >> https://issues.apache.org/jira/browse/ctakes
>>> >> >> Maven for building and dependency management.
>>> >> >> A lot of the developers use Eclipse IDE for their development.
>>> >> >> More info on ctakes.apache.org
>>> >> >>
>>> >> >> cTAKES is built on top of the Apache UIMA Framework. Essentially,
>>> >> >> cTAKES is a collection of Annotators (Java classes) wired together
>>> >> >> into a pipeline.
>>> >> >> Its goal, in a nutshell, is to turn unstructured plain text into a
>>> >> >> structured/normalized form, and it is specially trained for medical
>>> >> >> notes.
>>> >> >> Right now, the input cTAKES expects is plain text, and cTAKES
>>> >> >> does not have an OCR component.
>>> >> >> CTAKES-189:GSoC:implement OCR/tika to standardize text inputs was an
>>> >> >> idea to allow cTAKES to take in any type of input (PDF, images,
>>> >> >> Word, XLS, etc.) and pass the text on for cTAKES processing.
>>> >> >> [I was originally thinking this could be done in some kind of
>>> >> >> preprocessing, or an optional Annotator that could be added at the
>>> >> >> beginning of a pipeline].  There may be some existing work that
>>> >> >> could potentially be reused: Apache Tika (
>>> >> >> https://issues.apache.org/jira/browse/TIKA-93 ) as well as some
>>> >> >> open source OCR toolkits (JavaOCR).
>>> >> >>
>>> >> >> About Me:
>>> >> >>
>>> >> >> http://childrenshospital.org/cfapps/research/data_admin/Site3240/mainpageS3240P8.html
>>> >> >> http://www.linkedin.com/in/peistation
>>> >> >> http://people.apache.org/committer-index.html#chenpei
>>> >> >>
>>> >> >> > -----Original Message-----
>>> >> >> > From: sandeep rg [mailto:sandeep.f...@gmail.com]
>>> >> >> > Sent: Tuesday, July 09, 2013 1:19 PM
>>> >> >> > To: dev@ctakes.apache.org
>>> >> >> > Subject: Re: to involve in your development group
>>> >> >> >
>>> >> >> > Thanks a lot for your support. I would like to work with you.
>>> >> >> >
>>> >> >> > I have gone through the objectives of the software, used the
>>> >> >> > software, and gone through the various components of the project.
>>> >> >> > Can you give me a starting point for learning more about the
>>> >> >> > coding part of the project?
>>> >> >> >
>>> >> >> > Can you tell me more about the project, and about yourself as well?
>>> >> >> >
>>> >> >> >
>>> >> >> > On Tue, Jul 9, 2013 at 1:14 AM, Chen, Pei
>>> >> >> > <pei.c...@childrens.harvard.edu>wrote:
>>> >> >> >
>>> >> >> > > Hi Sandeep,
>>> >> >> > > Thank you for the interest.  I just had a quick look at the
>>> >> >> > > ICFOSS pilot mentoring program and will be happy to serve as a
>>> >> >> > > mentor for your project proposal(s) if you are interested.
>>> >> >> > >
>>> >> >> > > --Pei
>>> >> >> > >
>>> >> >> > > > -----Original Message-----
>>> >> >> > > > From: sandeep rg [mailto:sandeep.f...@gmail.com]
>>> >> >> > > > Sent: Monday, July 08, 2013 2:24 PM
>>> >> >> > > > To: dev@ctakes.apache.org
>>> >> >> > > > Subject: Re: to involve in your development group
>>> >> >> > > >
>>> >> >> > > > Sir,
>>> >> >> > > >
>>> >> >> > > > Details of the programme, the pilot mentoring programme with
>>> >> >> > > > ICFOSS in India, are given at the web address below:
>>> >> >> > > >
>>> >> >> > > > http://community.apache.org/mentoringprogramme-icfoss-pilot.html
>>> >> >> > > >
>>> >> >> > > > I am new to this community, so I need a mentor for the
>>> >> >> > > > project. It would be very helpful for me.
>>> >> >> > > >
>>> >> >> > > >
>>> >> >> > > > On Mon, Jul 8, 2013 at 7:22 PM, Chen, Pei
>>> >> >> > > > <pei.c...@childrens.harvard.edu>wrote:
>>> >> >> > > >
>>> >> >> > > > > Hi Sandeep,
>>> >> >> > > > > Welcome!  I am not familiar with the details of
>>> >> >> > > > > icfoss-apache, but please- you are more than welcome to
>>> >> >> > > > > work on the code, and contributions will be greatly
>>> >> >> > > > > appreciated!
>>> >> >> > > > > There may be a learning curve, but feel free to let us know
>>> >> >> > > > > if you have any questions/issues.
>>> >> >> > > > > Thanks,
>>> >> >> > > > > Pei
>>> >> >> > > > >
>>> >> >> > > > > > -----Original Message-----
>>> >> >> > > > > > From: sandeep rg [mailto:sandeep.f...@gmail.com]
>>> >> >> > > > > > Sent: Saturday, July 06, 2013 11:50 AM
>>> >> >> > > > > > To: dev@ctakes.apache.org
>>> >> >> > > > > > Subject: to involve in your development group
>>> >> >> > > > > >
>>> >> >> > > > > > My name is Sandeep. I am a B.Tech graduate. I
>>> >> >> > > > > > participated in a camp held in Kerala, India, in
>>> >> >> > > > > > association with icfoss-apache, called the youth
>>> >> >> > > > > > mentoring programme, coordinated by Luciano Resende.
>>> >> >> > > > > >
>>> >> >> > > > > > I like the project and would like to get involved in it
>>> >> >> > > > > > as a programmer. I have gone through your project and
>>> >> >> > > > > > through the bugs list. I would like to work on the bug
>>> >> >> > > > > > "CTAKES-189:GSoC:implement OCR/tika to standardize text
>>> >> >> > > > > > inputs for cTAKES". Can you allow me to work on that?
