Hi Chris Mattmann,

I participated in the event coordinated by Luciano Resende
(http://community.apache.org/mentoringprogramme-icfoss-pilot.html). There I
learned about open source, and I would like to work on your project, cTAKES.
I would like to fix the JIRA issue
https://issues.apache.org/jira/browse/CTAKES-189. Chen Pei has accepted my
request to be my mentor. Now I want to submit a proposal to Apache about the
project I am going to work on. Can you help me prepare a proposal, to be
submitted before the 18th of this July?

On Sat, Jul 13, 2013 at 2:26 AM, Mattmann, Chris A (398J) <
chris.a.mattm...@jpl.nasa.gov> wrote:

> Hi Sandeep,
>
> I think the best thing to do is:
>
> 1. Develop a JIRA issue here: https://issues.apache.org/jira/browse/CTAKES
> 1a. you can register for a new account on JIRA
> 2. Once your JIRA issue is created, feel free to start a [DISCUSS] thread
> (e.g., with subject [DISCUSS] "some topic" where "some topic" is perhaps
> the main idea you have) on dev@ctakes.apache.org, referencing your issue
> and asking for feedback
> 3. Work with the Apache cTAKES PMC and committers to get your patches and
> other items attached to your issue from #1 committed into the sources
>
> Ideally if 1-3 happen and it's a good interaction, Apache is built on
> meritocracy and you could possibly earn the merit to become a PMC member
> or committer on the project.
>
> Cheers,
> Chris
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: chris.a.mattm...@nasa.gov
> WWW: http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> -----Original Message-----
> From: sandeep rg <sandeep.f...@gmail.com>
> Reply-To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
> Date: Thursday, July 11, 2013 11:30 AM
> To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
> Subject: Re: to involve in your development group
>
> >Can you tell me what details I should include in a proposal? Should I
> >include all of the implementation (technical) details in the proposal?
> >
> >On Thu, Jul 11, 2013 at 9:45 PM, Mattmann, Chris A (398J) <
> >chris.a.mattm...@jpl.nasa.gov> wrote:
> >
> >> Dear Sandeep,
> >>
> >> Thanks for your interest in cTAKES. We would welcome your contribution
> >> and are happy to have your interest in the project.
> >>
> >> Cheers,
> >> Chris
> >>
> >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >> Chris Mattmann, Ph.D.
> >> Senior Computer Scientist
> >> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> >> Office: 171-266B, Mailstop: 171-246
> >> Email: chris.a.mattm...@nasa.gov
> >> WWW: http://sunset.usc.edu/~mattmann/
> >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >> Adjunct Assistant Professor, Computer Science Department
> >> University of Southern California, Los Angeles, CA 90089 USA
> >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>
> >> -----Original Message-----
> >> From: sandeep rg <sandeep.f...@gmail.com>
> >> Reply-To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
> >> Date: Wednesday, July 10, 2013 11:01 AM
> >> To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
> >> Subject: Re: to involve in your development group
> >>
> >> >Sir,
> >> >
> >> >My name is Sandeep RG. I am a B.Tech graduate in computer science, now
> >> >doing an internship at a company, working in Java.
> >> >
> >> >I have installed everything successfully and am now downloading the
> >> >resources. It is taking a long time.
> >> >
> >> >I have gone through the suggested OCR technologies.
> >> >JavaOCR has some good user reviews.
> >> >Apache Tika can process many different file formats.
> >> >There is also Tesseract, which is used for OCR.
> >> >Apache PDFBox is also used for text extraction, but only for PDF
> >> >files.
> >> >I am now going through all of them to find the best technology among
> >> >these.
> >> >
> >> >On Wed, Jul 10, 2013 at 12:52 AM, Chen, Pei
> >> ><pei.c...@childrens.harvard.edu> wrote:
> >> >
> >> >> Hi Sandeep,
> >> >> I am delighted to work with you on this project.
> >> >>
> >> >> I was not sure if I understood you correctly: did you mean to say
> >> >> that you have already tried using cTAKES and its components?
> >> >> If not, you can do an svn checkout of the code and try running the
> >> >> debugger GUI from the command line (or Eclipse IDE), which will
> >> >> allow you to type in plain text and get back the different
> >> >> structured content (types) that cTAKES produces:
> >> >> MAVEN_OPTS="-Xmx2g -Xms1g"
> >> >> mvn -PrunCVD compile
> >> >> From the guide:
> >> >> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+Developer+Install+Guide
> >> >>
> >> >> A bit of background:
> >> >> Apache cTAKES uses SVN for version control:
> >> >> https://svn.apache.org/repos/asf/ctakes/trunk/
> >> >> JIRA for issue tracking:
> >> >> https://issues.apache.org/jira/browse/ctakes
> >> >> Maven for building and dependency management.
> >> >> A lot of the developers use Eclipse IDE for their development.
> >> >> More info on ctakes.apache.org
> >> >>
> >> >> cTAKES is built on top of the Apache UIMA Framework. Essentially,
> >> >> cTAKES is a collection of Annotators (Java classes) wired together
> >> >> into a pipeline.
> >> >> Its goal in a nutshell is to turn unstructured plain text into
> >> >> structured/normalized form; it is specially trained for medical
> >> >> notes.
> >> >> Right now, the input cTAKES expects is plain text, and cTAKES does
> >> >> not have an OCR component.
> >> >> CTAKES-189: GSoC: implement OCR/tika to standardize text inputs was
> >> >> an idea to allow cTAKES to take in any type of input (PDF, images,
> >> >> Word, XLS, etc.) and pass the text on for cTAKES processing.
> >> >> [I was originally thinking this could be done in some kind of
> >> >> preprocessing, or an optional Annotator that could be added at the
> >> >> beginning of a pipeline]. There may be some existing work that
> >> >> could potentially be reused: Apache Tika (
> >> >> https://issues.apache.org/jira/browse/TIKA-93 ) as well as some open
> >> >> source OCR toolkits (JavaOCR).
> >> >>
> >> >> About Me:
> >> >>
> >> >> http://childrenshospital.org/cfapps/research/data_admin/Site3240/mainpageS3240P8.html
> >> >> http://www.linkedin.com/in/peistation
> >> >> http://people.apache.org/committer-index.html#chenpei
> >> >>
> >> >> > -----Original Message-----
> >> >> > From: sandeep rg [mailto:sandeep.f...@gmail.com]
> >> >> > Sent: Tuesday, July 09, 2013 1:19 PM
> >> >> > To: dev@ctakes.apache.org
> >> >> > Subject: Re: to involve in your development group
> >> >> >
> >> >> > Thanks a lot for your support. I would like to work with you.
> >> >> >
> >> >> > I have gone through the objectives of the software, used the
> >> >> > software, and gone through the various components of the project.
> >> >> > Can you give me a starting point for learning more about the
> >> >> > coding part of the project?
> >> >> >
> >> >> > Can you tell me more about the project, and about yourself as
> >> >> > well?
> >> >> >
> >> >> > On Tue, Jul 9, 2013 at 1:14 AM, Chen, Pei
> >> >> > <pei.c...@childrens.harvard.edu> wrote:
> >> >> >
> >> >> > > Hi Sandeep,
> >> >> > > Thank you for the interest. I just had a quick look at the
> >> >> > > ICFOSS pilot mentoring program and will be happy to serve as a
> >> >> > > mentor for your project proposal(s) if you are interested.
> >> >> > >
> >> >> > > --Pei
> >> >> > >
> >> >> > > > -----Original Message-----
> >> >> > > > From: sandeep rg [mailto:sandeep.f...@gmail.com]
> >> >> > > > Sent: Monday, July 08, 2013 2:24 PM
> >> >> > > > To: dev@ctakes.apache.org
> >> >> > > > Subject: Re: to involve in your development group
> >> >> > > >
> >> >> > > > Sir,
> >> >> > > >
> >> >> > > > Details of the pilot mentoring programme with ICFOSS, India,
> >> >> > > > are given at the web address below:
> >> >> > > >
> >> >> > > > http://community.apache.org/mentoringprogramme-icfoss-pilot.html
> >> >> > > >
> >> >> > > > I am new to this community, so I need a mentor for the
> >> >> > > > project. It would be very helpful for me.
> >> >> > > >
> >> >> > > > On Mon, Jul 8, 2013 at 7:22 PM, Chen, Pei
> >> >> > > > <pei.c...@childrens.harvard.edu> wrote:
> >> >> > > >
> >> >> > > > > Hi Sandeep,
> >> >> > > > > Welcome! I am not familiar with the details of
> >> >> > > > > icfoss-apache, but please, you are more than welcome to
> >> >> > > > > work on the code, and contributions will be greatly
> >> >> > > > > appreciated!
> >> >> > > > > There may be a learning curve, but feel free to let us know
> >> >> > > > > if you have any questions/issues.
> >> >> > > > > Thanks,
> >> >> > > > > Pei
> >> >> > > > >
> >> >> > > > > > -----Original Message-----
> >> >> > > > > > From: sandeep rg [mailto:sandeep.f...@gmail.com]
> >> >> > > > > > Sent: Saturday, July 06, 2013 11:50 AM
> >> >> > > > > > To: dev@ctakes.apache.org
> >> >> > > > > > Subject: to involve in your development group
> >> >> > > > > >
> >> >> > > > > > My name is Sandeep. I am a B.Tech graduate. I
> >> >> > > > > > participated in a camp held in Kerala, India, in
> >> >> > > > > > association with icfoss-apache, called the youth
> >> >> > > > > > mentoring programme and coordinated by Luciano Resende.
> >> >> > > > > >
> >> >> > > > > > I like the project and would like to get involved in
> >> >> > > > > > it as a programmer. I have gone through your project
> >> >> > > > > > and through the bug list. I would like to work on the
> >> >> > > > > > bug "cTAKE-189:GSoC:implement OCR/tika to standardize
> >> >> > > > > > text inputs for cTAKES". Can you allow me to work on
> >> >> > > > > > that?