Sir,
I will provide the proposal within two days. For now I am mainly going
through the ASF-ICFOSS gateway, because if I go that way and my proposal
is selected, ICFOSS will provide us with some support, such as
certificates and small financial support.


But the main thing is that I like programming; I like to explore new
technologies in coding and to work with code. So even if my proposal is
rejected, I would still like to work on your project as a volunteer, if
you allow me.

I am now preparing the proposal and will submit it within 2 days. Chris
Mattmann helped me learn more about the proposal format.


On Tue, Jul 16, 2013 at 8:12 PM, Chen, Pei
<[email protected]> wrote:

> Chris/Sandeep,
> According to ASF-ICFOSS, I believe the deadline for submitting proposals
> is this coming Friday (July 19).
> After which point, mentors will have 2 weeks to review and score/accept.
> Just curious, are we planning to follow the same process here?  Or, since
> it's all volunteer work, technically Sandeep can still contribute code to
> the community and participate in the dev group here.
>
> Looking forward to it.
> --Pei
>
>
> > -----Original Message-----
> > From: sandeep rg [mailto:[email protected]]
> > Sent: Monday, July 15, 2013 1:05 PM
> > To: [email protected]
> > Subject: Re: to involve in your development group
> >
> > Sir,
> > I have gone through most of the OCR technologies and reached a
> > conclusion: I would like to use Apache Tika and JavaOCR for this purpose.
> >
> > Tesseract is an OCR tool that can extract text in multiple languages.
> > It is implemented in C++, so it can be accessed from Java through native
> > methods. There is another tool, Tess4J, but reviews say that it has
> > many bugs.
> >
> > Apache Tika is developed in Java. It can be used to extract text data
> > from .xls, Word, .txt, PDF, and many other formats. It is also easy to
> > integrate into a project. I have just gone through how it is implemented.
> >
> > As for JavaOCR, it is good for extracting text from JPEG or scanned
> > images. We can train it with various fonts; the more we train it, the
> > better its accuracy, but its speed decreases. I did not find any
> > particular documentation for it.
> >
> >
> >
> > On Sun, Jul 14, 2013 at 9:18 PM, sandeep rg <[email protected]>
> > wrote:
> >
> > > Thanks a lot to both of you for your support. I will do my best to find
> > > a solution for the JIRA issue. I will share the proposal with both of you.
> > >
> > >
> > >
> > > On Sun, Jul 14, 2013 at 1:46 AM, Chen, Pei <[email protected]> wrote:
> > >
> > >> Sandeep,
> > >> It's great to have Chris on board as well; he was one of the
> > >> coordinators of GSoC.
> > >> Looking forward to it.
> > >>
> > >> Sent from my iPhone
> > >>
> > >> On Jul 13, 2013, at 12:24 PM, "Mattmann, Chris A (398J)" <
> > >> [email protected]> wrote:
> > >>
> > >> > Hi Sandeep,
> > >> >
> > >> > That is great news, and good job. OK, for some ideas about developing
> > >> > your proposal, you may want to simply start with a Google Docs, and
> > >> > then share it with Pei. I'd be happy to help co-mentor if Pei and you
> > >> > think it's useful too.
> > >> >
> > >> > Your proposal should likely cover:
> > >> >
> > >> > 1. Background - what's the state of CTAKES-189 and what's it trying to
> > >> >  accomplish (include some figures, etc. along with your text)
> > >> >
> > >> > 2. Approach - what are you going to do to solve CTAKES-189. Be
> > >> >  specific, and try to break it down into smaller, easily reversible
> > >> >  steps
> > >> >
> > >> > 3. Schedule - how long and what is the schedule for achieving this?
> > >> >
> > >> > 4. Risks/etc. - any known risks like are you taking a vacation anytime
> > >> >  soon :) or are there other time constraints?
> > >> >
> > >> > 5. References, etc.
> > >> >
> > >> > HTH and I'd be happy if you want to share the GDocs with me as you
> > >> > develop it.
> > >> >
> > >> > Cheers!
> > >> >
> > >> > Chris
> > >> >
> > >> >
> > >> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >> > Chris Mattmann, Ph.D.
> > >> > Senior Computer Scientist
> > >> > NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> > >> > Office: 171-266B, Mailstop: 171-246
> > >> > Email: [email protected]
> > >> > WWW:  http://sunset.usc.edu/~mattmann/
> > >> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >> > Adjunct Assistant Professor, Computer Science Department
> > >> > University of Southern California, Los Angeles, CA 90089 USA
> > >> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >> >
> > >> >
> > >> >
> > >> >
> > >> >
> > >> >
> > >> > -----Original Message-----
> > >> > From: sandeep rg <[email protected]>
> > >> > Reply-To: "[email protected]" <[email protected]>
> > >> > Date: Saturday, July 13, 2013 8:57 AM
> > >> > To: "[email protected]" <[email protected]>
> > >> > Subject: Re: to involve in your development group
> > >> >
> > >> >> I have also gone through the technologies available for OCR
> > >> >> development; from those, I think Apache Tika and Tesseract are best
> > >> >> for resolving the problem.
> > >> >>
> > >> >>
> > >> >> On Sat, Jul 13, 2013 at 9:02 PM, sandeep rg <[email protected]> wrote:
> > >> >>
> > >> >>> Hi Chris Mattmann,
> > >> >>> I participated in the event coordinated by Luciano Resende:
> > >> >>>
> > >> >>> http://community.apache.org/mentoringprogramme-icfoss-pilot.html
> > >> >>>
> > >> >>> From that I learned about open source, and I would like to work on
> > >> >>> your project, cTAKES. I would like to fix this JIRA issue:
> > >> >>>
> > >> >>> https://issues.apache.org/jira/browse/CTAKES-189
> > >> >>>
> > >> >>> Pei Chen accepted my request to be my mentor. Now I want to submit a
> > >> >>> proposal to Apache about the project I am going to work on. Can you
> > >> >>> help me prepare a proposal to be submitted before the 18th of this
> > >> >>> July?
> > >> >>>
> > >> >>>
> > >> >>>
> > >> >>>
> > >> >>>
> > >> >>>
> > >> >>> On Sat, Jul 13, 2013 at 2:26 AM, Mattmann, Chris A (398J) <
> > >> >>> [email protected]> wrote:
> > >> >>>
> > >> >>>> Hi Sandeep,
> > >> >>>>
> > >> >>>> I think the best thing to do is:
> > >> >>>>
> > >> >>>> 1. Develop a JIRA issue here:
> > >> >>>> https://issues.apache.org/jira/browse/CTAKES
> > >> >>>> 1a. you can register for a new account on JIRA
> > >> >>>> 2. Once your JIRA issue is created, feel free to start a [DISCUSS]
> > >> >>>> thread (e.g., with subject [DISCUSS] "some topic" where "some topic"
> > >> >>>> is perhaps the main idea you have) on [email protected],
> > >> >>>> referencing your issue and asking for feedback
> > >> >>>> 3. Work with the Apache cTAKES PMC and committers to get your patches
> > >> >>>> and other items attached to your issue from #1 committed into the
> > >> >>>> sources
> > >> >>>>
> > >> >>>> Ideally if 1-3 happen and it's a good interaction, Apache is built on
> > >> >>>> meritocracy and you could possibly earn the merit to become a PMC
> > >> >>>> member or committer on the project.
> > >> >>>>
> > >> >>>> Cheers,
> > >> >>>> Chris
> > >> >>>>
> > >> >>>>
> > >> >>>>
> > >> >>>>
> > >> >>>>
> > >> >>>>
> > >> >>>>
> > >> >>>>
> > >> >>>> -----Original Message-----
> > >> >>>> From: sandeep rg <[email protected]>
> > >> >>>> Reply-To: "[email protected]" <[email protected]>
> > >> >>>> Date: Thursday, July 11, 2013 11:30 AM
> > >> >>>> To: "[email protected]" <[email protected]>
> > >> >>>> Subject: Re: to involve in your development group
> > >> >>>>
> > >> >>>>> Can you tell me what details I should include in a proposal? Should
> > >> >>>>> I include all the implementation (technical) details in the proposal?
> > >> >>>>>
> > >> >>>>>
> > >> >>>>> On Thu, Jul 11, 2013 at 9:45 PM, Mattmann, Chris A (398J) <
> > >> >>>>> [email protected]> wrote:
> > >> >>>>>
> > >> >>>>>> Dear Sandeep,
> > >> >>>>>>
> > >> >>>>>> Thanks for your interest in cTAKES. We would welcome your
> > >> >>>>>> contribution and are happy to have your interest in the project.
> > >> >>>>>>
> > >> >>>>>> Cheers,
> > >> >>>>>> Chris
> > >> >>>>>>
> > >> >>>>>>
> > >> >>>>>>
> > >> >>>>>>
> > >> >>>>>>
> > >> >>>>>>
> > >> >>>>>>
> > >> >>>>>>
> > >> >>>>>> -----Original Message-----
> > >> >>>>>> From: sandeep rg <[email protected]>
> > >> >>>>>> Reply-To: "[email protected]" <[email protected]>
> > >> >>>>>> Date: Wednesday, July 10, 2013 11:01 AM
> > >> >>>>>> To: "[email protected]" <[email protected]>
> > >> >>>>>> Subject: Re: to involve in your development group
> > >> >>>>>>
> > >> >>>>>>> Sir,
> > >> >>>>>>>
> > >> >>>>>>> My name is Sandeep RG. I am a B.Tech graduate in computer science,
> > >> >>>>>>> now doing an internship at a company, working in Java.
> > >> >>>>>>>
> > >> >>>>>>> I have installed everything successfully and am now downloading the
> > >> >>>>>>> resources. It takes too much time.
> > >> >>>>>>>
> > >> >>>>>>> I have gone through the suggested OCR technologies.
> > >> >>>>>>> JavaOCR has some good user reviews.
> > >> >>>>>>> Apache Tika can process many different types of formats.
> > >> >>>>>>> Beyond that, there is Tesseract, which is also used for OCR.
> > >> >>>>>>> Apache PDFBox is also used for text extraction, but only for PDF
> > >> >>>>>>> files.
> > >> >>>>>>> Now I am going through everything to find the best technology among
> > >> >>>>>>> these.
> > >> >>>>>>>
> > >> >>>>>>>
> > >> >>>>>>> On Wed, Jul 10, 2013 at 12:52 AM, Chen, Pei
> > >> >>>>>>> <[email protected]> wrote:
> > >> >>>>>>>
> > >> >>>>>>>> Hi Sandeep,
> > >> >>>>>>>> I am delighted to work with you on this project.
> > >> >>>>>>>>
> > >> >>>>>>>> I was not sure if I understood you correctly. Did you mean to say
> > >> >>>>>>>> that you have already tried using cTAKES and its components?
> > >> >>>>>>>> If not, you can do an svn checkout of the code and try running the
> > >> >>>>>>>> debugger GUI from the command line (or Eclipse IDE), which will
> > >> >>>>>>>> allow you to type in plain text and get back the different
> > >> >>>>>>>> structured content (types) that cTAKES produces:
> > >> >>>>>>>> MAVEN_OPTS="-Xmx2g -Xms1g"
> > >> >>>>>>>> mvn -PrunCVD compile
> > >> >>>>>>>> From the guide:
> > >> >>>>>>>> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+Developer+Install+Guide
> > >> >>>>>>>>
> > >> >>>>>>>> A bit of background:
> > >> >>>>>>>> Apache cTAKES uses SVN for version control:
> > >> >>>>>>>> https://svn.apache.org/repos/asf/ctakes/trunk/
> > >> >>>>>>>> Jira for issue tracking:
> > >> >>>>>>>> https://issues.apache.org/jira/browse/ctakes
> > >> >>>>>>>> Maven for building and dependency management.
> > >> >>>>>>>> A lot of the developers use Eclipse IDE for their development.
> > >> >>>>>>>> More info on ctakes.apache.org
> > >> >>>>>>>>
> > >> >>>>>>>> cTAKES is built on top of the Apache UIMA Framework. Essentially,
> > >> >>>>>>>> cTAKES is a collection of Annotators (Java classes) wired together
> > >> >>>>>>>> into a pipeline.
> > >> >>>>>>>> Its goal in a nutshell is to turn unstructured plain text into
> > >> >>>>>>>> structured/normalized form, and it is specially trained for medical
> > >> >>>>>>>> notes.
> > >> >>>>>>>> Right now, the input cTAKES expects is plain text, and cTAKES does
> > >> >>>>>>>> not have an OCR component.
> > >> >>>>>>>> CTAKES-189 (GSoC: implement OCR/tika to standardize text inputs)
> > >> >>>>>>>> was an idea to allow cTAKES to take in any type of input (PDF,
> > >> >>>>>>>> images, Word, XLS, etc.) and pass the text on for cTAKES processing.
> > >> >>>>>>>> [I was originally thinking this could be done in some kind of
> > >> >>>>>>>> preprocessing, or an optional Annotator that could be added at the
> > >> >>>>>>>> beginning of a pipeline].  There may be some existing work that
> > >> >>>>>>>> could potentially be reused: Apache Tika (
> > >> >>>>>>>> https://issues.apache.org/jira/browse/TIKA-93 ) as well as some
> > >> >>>>>>>> open source OCR toolkits (JavaOCR).
> > >> >>>>>>>>
> > >> >>>>>>>> About Me:
> > >> >>>>>>>> http://childrenshospital.org/cfapps/research/data_admin/Site3240/mainpageS3240P8.html
> > >> >>>>>>>> http://www.linkedin.com/in/peistation
> > >> >>>>>>>> http://people.apache.org/committer-index.html#chenpei
> > >> >>>>>>>>
> > >> >>>>>>>>> -----Original Message-----
> > >> >>>>>>>>> From: sandeep rg [mailto:[email protected]]
> > >> >>>>>>>>> Sent: Tuesday, July 09, 2013 1:19 PM
> > >> >>>>>>>>> To: [email protected]
> > >> >>>>>>>>> Subject: Re: to involve in your development group
> > >> >>>>>>>>>
> > >> >>>>>>>>> Thanks a lot for giving me support. I would like to work with you.
> > >> >>>>>>>>>
> > >> >>>>>>>>> I have gone through the objectives of the software, used the
> > >> >>>>>>>>> software, and gone through various components of the project. Can
> > >> >>>>>>>>> you give me a starting point from which to learn more about the
> > >> >>>>>>>>> coding part of the project?
> > >> >>>>>>>>>
> > >> >>>>>>>>> Can you tell me more about the project and about yourself as well?
> > >> >>>>>>>>>
> > >> >>>>>>>>>
> > >> >>>>>>>>> On Tue, Jul 9, 2013 at 1:14 AM, Chen, Pei
> > >> >>>>>>>>> <[email protected]> wrote:
> > >> >>>>>>>>>
> > >> >>>>>>>>>> Hi Sandeep,
> > >> >>>>>>>>>> Thank you for the interest.  I just had a quick look at the
> > >> >>>>>>>>>> ICFOSS pilot mentoring program and will be happy to serve as a
> > >> >>>>>>>>>> mentor for your project proposal(s) if you are interested.
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> --Pei
> > >> >>>>>>>>>>
> > >> >>>>>>>>>>> -----Original Message-----
> > >> >>>>>>>>>>> From: sandeep rg [mailto:[email protected]]
> > >> >>>>>>>>>>> Sent: Monday, July 08, 2013 2:24 PM
> > >> >>>>>>>>>>> To: [email protected]
> > >> >>>>>>>>>>> Subject: Re: to involve in your development group
> > >> >>>>>>>>>>>
> > >> >>>>>>>>>>> Sir,
> > >> >>>>>>>>>>>
> > >> >>>>>>>>>>> Details of the Pilot Mentoring Programme with India ICFOSS are
> > >> >>>>>>>>>>> given at the web address below:
> > >> >>>>>>>>>>> http://community.apache.org/mentoringprogramme-icfoss-pilot.html
> > >> >>>>>>>>>>>
> > >> >>>>>>>>>>>
> > >> >>>>>>>>>>> I am new to this community, so I need a mentor for the project.
> > >> >>>>>>>>>>> It would be very helpful for me.
> > >> >>>>>>>>>>>
> > >> >>>>>>>>>>>
> > >> >>>>>>>>>>> On Mon, Jul 8, 2013 at 7:22 PM, Chen, Pei
> > >> >>>>>>>>>>> <[email protected]> wrote:
> > >> >>>>>>>>>>>
> > >> >>>>>>>>>>>> Hi Sandeep,
> > >> >>>>>>>>>>>> Welcome!  I am not familiar with the details of icfoss-apache,
> > >> >>>>>>>>>>>> but please, you are more than welcome to work on the code, and
> > >> >>>>>>>>>>>> contributions will be greatly appreciated!
> > >> >>>>>>>>>>>> There may be a learning curve, but feel free to let us know if
> > >> >>>>>>>>>>>> you have any questions/issues.
> > >> >>>>>>>>>>>> Thanks,
> > >> >>>>>>>>>>>> Pei
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>> -----Original Message-----
> > >> >>>>>>>>>>>>> From: sandeep rg [mailto:[email protected]]
> > >> >>>>>>>>>>>>> Sent: Saturday, July 06, 2013 11:50 AM
> > >> >>>>>>>>>>>>> To: [email protected]
> > >> >>>>>>>>>>>>> Subject: to involve in your development group
> > >> >>>>>>>>>>>>>
> > >> >>>>>>>>>>>>> My name is Sandeep. I am a B.Tech graduate. I participated in
> > >> >>>>>>>>>>>>> a camp held in Kerala, India, in association with
> > >> >>>>>>>>>>>>> icfoss-apache, called the youth mentoring programme and
> > >> >>>>>>>>>>>>> coordinated by Luciano Resende.
> > >> >>>>>>>>>>>>>
> > >> >>>>>>>>>>>>> I like the project and would like to be involved in it as a
> > >> >>>>>>>>>>>>> programmer. I have gone through your project and the bug
> > >> >>>>>>>>>>>>> list. I would like to work on the bug
> > >> >>>>>>>>>>>>> "CTAKES-189: GSoC: implement OCR/tika to standardize text
> > >> >>>>>>>>>>>>> inputs for cTAKES". Can you allow me to work on that?
> > >> >
> > >>
> > >
> > >
>
