RE: to involve in your development group

2013-08-09 Thread Chen, Pei
Hi Samir,
I just wanted to make sure I'm not confused.
Are you working with Sandeep on the OCR project?  
If not, would you be able to create a new email thread for it?
If yes, could you confirm which pipeline you used?  Is it the
DefaultPlaintextAggregateUMLSPipeline?  A lookup shouldn't take hours.
Also, you may want to develop against trunk for the OCR project so that any
new code can be easily integrated into future releases.

--Pei


Re: to involve in your development group

2013-08-08 Thread samir chabou
Hi,
I have an issue with cTAKES 3.0: when I use the DictionaryLookupAnnotatorUMLS,
it takes hours to get the annotations for just one line of input text. I
didn't have this issue with cTAKES 2.5. Do you have any idea why this is
happening in cTAKES 3.0?
Thank you very much






Re: to involve in your development group

2013-08-07 Thread sandeep rg
Sir,
Thank you, Pei Chen and Chris Mattmann, for accepting my proposal for
implementing OCR. I have started my work. I will try my best to keep
to the schedule, and I will update you on my progress.



RE: to involve in your development group

2013-08-07 Thread Chen, Pei
Thanks Sandeep.
Looking forward to it!  Feel free to ping us in case you have any 
questions/issues.
--Pei


Re: to involve in your development group

2013-08-07 Thread Mattmann, Chris A (398J)
Awesome, great news and yes please proceed!

We're here to help.

Cheers,
Chris

++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++







Re: to involve in your development group

2013-07-23 Thread sandeep rg
Thank you, Sean Finan, for your suggestion. I am now going through
JAI; I think it has more features than JavaOcr.




RE: to involve in your development group

2013-07-22 Thread Finan, Sean
Hi Sandeep,

I just took a peek at the JavaOcr code, and it looks like they perform image 
filtering in the PixelImage class.  This would probably cause a problem with 
dot matrix images as every corner of every dot would be removed as noise, so 
dots that participate in curves on characters such as P would be removed to 
form something more like |'.  In fact, depending upon the spacing between 
matrix dots and the resolution of the scan, the filter could decrease the size 
of each dot, making it very difficult for the OCR to work at all.

Assuming that you have already tried to train the software using your dot
matrix printouts, you could change JavaOcr to use Java Advanced Imaging (JAI).
You would then use the JAI Raster class instead of the JavaOcr PixelImage class
for image manipulation.  There are a lot of things that you could do from that
point forward.
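
As a rough illustration of the kind of thing that becomes possible once the pixels are available as a Raster, here is a minimal sketch that grows every dark pixel into its 3x3 neighbourhood, so that dot-matrix dots separated by a one-pixel gap merge into a continuous stroke. It uses only the java.awt.image classes from the JDK. The class name DotMatrixDilate, the threshold of 128, the 3x3 neighbourhood, and the assumption that the input is an 8-bit grayscale image are all illustrative choices and are not taken from JavaOcr or JAI.

```java
import java.awt.image.BufferedImage;
import java.awt.image.Raster;
import java.awt.image.WritableRaster;

public class DotMatrixDilate {

    // Assumed cut-off between "ink" and "paper" for 8-bit gray samples.
    static final int THRESHOLD = 128;

    /**
     * Returns a new grayscale image in which a pixel is black (0) if any
     * pixel in its 3x3 neighbourhood in src is darker than THRESHOLD,
     * and white (255) otherwise. src is assumed to be TYPE_BYTE_GRAY.
     */
    static BufferedImage dilateDark(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        Raster in = src.getRaster();
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_GRAY);
        WritableRaster out = dst.getRaster();
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                out.setSample(x, y, 0, hasDarkNeighbour(in, x, y, w, h) ? 0 : 255);
            }
        }
        return dst;
    }

    // True if the pixel at (x, y) or any of its in-bounds neighbours is dark.
    static boolean hasDarkNeighbour(Raster in, int x, int y, int w, int h) {
        for (int dy = -1; dy <= 1; dy++) {
            for (int dx = -1; dx <= 1; dx++) {
                int nx = x + dx;
                int ny = y + dy;
                if (nx >= 0 && nx < w && ny >= 0 && ny < h
                        && in.getSample(nx, ny, 0) < THRESHOLD) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

Whether this helps in practice would depend on the scan resolution and the spacing between dots; a larger neighbourhood, or a dilation followed by an erosion, may be needed for real scans.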

Just giving you my initial thought,

Sean

-Original Message-
From: sandeep rg [mailto:sandeep.f...@gmail.com] 
Sent: Monday, July 22, 2013 10:06 AM
To: dev@ctakes.apache.org
Subject: Re: to involve in your development group

Sir,
I have gone through some medical records such as bills, patient
details, etc. Most of them are printed on dot matrix printers, which makes
it very hard to extract text from the scanned images. I have tested
some professional software such as ABBYY FineReader, which also gave
poor output.

But sir, I am confident I can do it. I need more knowledge about
image processing, though. Can you suggest anyone on your team who is good at
image processing programming?


On Thu, Jul 18, 2013 at 1:22 AM, sandeep rg sandeep.f...@gmail.com wrote:

 I have done the sequence diagram and made some small changes. Please go through
 it and tell me if anything more should be included.


 On Wed, Jul 17, 2013 at 9:37 PM, sandeep rg sandeep.f...@gmail.com wrote:

 It is just a skeleton of the original proposal.


 On Wed, Jul 17, 2013 at 9:31 PM, sandeep rg sandeep.f...@gmail.com wrote:

 The sample work is shared with you both. If any more details should be
 included, please tell me.
 The GUI design, schedule, and implementation flow chart are still to be
 added; they are under construction and will be uploaded within a few hours.


 On Wed, Jul 17, 2013 at 7:56 PM, Chen, Pei 
 pei.c...@childrens.harvard.edu wrote:

 pei.stat...@gmail.com

  -Original Message-
  From: Mattmann, Chris A (398J) [mailto:chris.a.mattm...@jpl.nasa.gov]
  Sent: Wednesday, July 17, 2013 10:22 AM
  To: dev@ctakes.apache.org
  Subject: Re: to involve in your development group
 
  chris.mattm...@gmail.com
 
  
 
 
 
 
 
 
  -Original Message-
  From: sandeep rg sandeep.f...@gmail.com
  Reply-To: dev@ctakes.apache.org dev@ctakes.apache.org
  Date: Wednesday, July 17, 2013 6:53 AM
  To: dev@ctakes.apache.org dev@ctakes.apache.org
  Subject: Re: to involve in your development group
 
  Can you provide your Gmail ID so that I can share the proposal document
  with you?
  
  
  
  On Tue, Jul 16, 2013 at 11:33 PM, sandeep rg sandeep.f...@gmail.com wrote:

   Sir,
   I will provide the proposal within two days. I am mainly going through the
   ASF-ICFOSS gateway because, if I go that way and my proposal
   is selected, ICFOSS will provide some support such as
   certificates, small financial support, etc. to us.

   But the main thing is that I like programming. I like to explore
   new technologies in coding and to work with code, so if
   my proposal is rejected, I would still like to work on your project
   as a volunteer, if you allow me.

   I am preparing the proposal now and will submit it within 2 days.
   Chris Mattmann helped me learn more about the proposal format.
  
  
   On Tue, Jul 16, 2013 at 8:12 PM, Chen, Pei
   pei.c...@childrens.harvard.edu wrote:

   Chris/Sandeep,
   According to ASF-ICFOSS, I believe the deadline for submitting
   proposals is this coming Friday (July 19).
   After that, mentors will have 2 weeks to review and
   score/accept.
   Just curious, are we planning to follow the same process here?  Or,
   since it's all volunteer work, technically Sandeep can still
   contribute code to the community and participate in the dev group
   here.

   Looking forward to it.
   --Pei
  
  
-Original Message-
From: sandeep rg [mailto:sandeep.f...@gmail.com]
Sent: Monday, July 15, 2013 1:05 PM
To: dev

Re: to involve in your development group

2013-07-22 Thread Mattmann, Chris A (398J)
Hi Sandeep,

I'll try and review this today.

Cheers,
Chris

++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++






-Original Message-
From: sandeep rg sandeep.f...@gmail.com
Reply-To: dev@ctakes.apache.org dev@ctakes.apache.org
Date: Monday, July 22, 2013 7:04 AM
To: dev@ctakes.apache.org dev@ctakes.apache.org
Subject: Re: to involve in your development group

Sir,
I have gone through some medical records such as bills, patient details,
etc. Most of them are printed on a dot matrix printer, and this type of
text is very hard to extract from scanned images. I have tested some
professional software such as ABBYY FineReader, which also gave poor
output.

But sir, I am confident I can do it. I just need more knowledge about
image processing. Can you suggest anyone on your team who is good at
image processing programming?



Re: to involve in your development group

2013-07-17 Thread sandeep rg
can you provide your gmail id to share the proposal document with you?



On Tue, Jul 16, 2013 at 11:33 PM, sandeep rg sandeep.f...@gmail.com wrote:

 Sir,
 I will provide the proposal within two days. For now I am mainly going
 through the ASF-ICFOSS gateway, because if I go that way and my proposal
 is selected, ICFOSS will provide some support to us, such as certificates
 and a small amount of financial support.

 But the main thing is that I like programming. I like to explore new
 technologies in coding and to work hands-on with code. So even if my
 proposal is rejected, I would still like to work on your project as a
 volunteer, if you allow me.

 I am preparing the proposal now and will submit it within 2 days. Chris
 Mattmann helped me learn more about the proposal format.


 On Tue, Jul 16, 2013 at 8:12 PM, Chen, Pei pei.c...@childrens.harvard.edu
  wrote:

 Chris/Sandeep,
 According to ASF-ICFOSS, I believe the deadline for submitting proposals
 is this coming Friday (July 19).
 After which point, mentors will have 2 weeks to review and score/accept.
 Just curious, are we planning to follow the same process here?  Or, since
 it's all volunteer work, technically Sandeep can still contribute code to
 the community and participate in the dev group here.

 Looking forward to it.
 --Pei


  -Original Message-
  From: sandeep rg [mailto:sandeep.f...@gmail.com]
  Sent: Monday, July 15, 2013 1:05 PM
  To: dev@ctakes.apache.org
  Subject: Re: to involve in your development group
 
  Sir,
  I have gone through most of the OCR technologies and reached a
  conclusion: I would like to use Apache Tika and JavaOCR for this purpose.

  Tesseract is an OCR tool that can extract text in multiple languages. It
  is implemented in VC++, so it can be accessed from Java through native
  functions. There is also another tool, Tess4J, but reviews say that it
  has many bugs.

  Apache Tika is developed in Java. It can be used to extract text data
  from .xls, Word, .txt, PDF and many other formats. It is also easy to
  integrate into the project. I have only just gone through how it is
  implemented.

  As for JavaOCR, it is good for extracting text from JPEG or scanned
  images. We can train it with various fonts; the more we train it, the
  higher its accuracy, but its speed decreases. I didn't find any
  particular documentation for it.
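
A minimal sketch of the Tesseract option discussed above, in case it helps
compare the tools. It does not use the Java native route from the message;
as a simpler stand-in it calls the `tesseract` command-line program, which
is assumed to be installed and on the PATH. The function names, the file
names and the default language code "eng" are hypothetical and chosen only
for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, language="eng"):
    """Return the argument list for one Tesseract command-line run.

    The command has the form: tesseract <image> <outputbase> -l <lang>
    Tesseract appends ".txt" to the output base name by itself.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", language]


def ocr_image(image_path, output_base, language="eng"):
    """Run Tesseract on one scanned image and return the recognized text."""
    # Fail early with a clear message if the program is not installed.
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    command = build_tesseract_command(image_path, output_base, language)
    # check=True raises CalledProcessError if Tesseract exits with an error.
    subprocess.run(command, check=True, capture_output=True)
    # Read back the text file that Tesseract wrote.
    return Path(str(output_base) + ".txt").read_text(encoding="utf-8")
```

With a scanned bill saved as `bill.png`, calling `ocr_image("bill.png", "bill")`
would write `bill.txt` and return its contents. Keeping the command
construction in its own function makes it possible to check the arguments
without having Tesseract installed.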
 
 
 
  On Sun, Jul 14, 2013 at 9:18 PM, sandeep rg sandeep.f...@gmail.com
  wrote:
 
   Thanks a lot to both of you for your support. I will do my best to find
   a solution for the JIRA problem. I will share the proposal with both of
   you.
  
  
  
   On Sun, Jul 14, 2013 at 1:46 AM, Chen, Pei pei.c...@childrens.harvard.edu wrote:

   Sandeep,
   It's great to have Chris on board as well; he was one of the coordinators
   of GSoC.
   Looking forward to it.
  
   Sent from my iPhone
  
   On Jul 13, 2013, at 12:24 PM, Mattmann, Chris A (398J) 
   chris.a.mattm...@jpl.nasa.gov wrote:
  
Hi Sandeep,

That is great news, and good job. OK, for some ideas about developing
your proposal, you may want to simply start with a Google Docs, and then
share it with Pei. I'd be happy to help co-mentor if Pei and you think
it's useful too.

Your proposal should likely cover:

1. Background - what's the state of CTAKES-189 and what's it trying to
   accomplish (include some figures, etc. along with your text)

2. Approach - what are you going to do to solve CTAKES-189. Be specific,
   and try to break it down into smaller, easily reversible steps

3. Schedule - how long and what is the schedule for achieving this?

4. Risks/etc. - any known risks like are you taking a vacation anytime
   soon :) or are there other time constraints?

5. References, etc.

HTH and I'd be happy if you want to share the GDocs with me as you
develop it.

Cheers!

Chris
   
   
  
   
   
   
   
   
   
-Original Message-
From: sandeep rg sandeep.f...@gmail.com
Reply-To: dev@ctakes.apache.org dev@ctakes.apache.org
Date: Saturday, July 13, 2013 8:57 AM
To: dev@ctakes.apache.org dev@ctakes.apache.org
Subject: Re: to involve in your development group
   
I have also gone through the technologies available for OCR development.
Of those, I think Apache Tika and Tesseract are best for resolving

Re: to involve in your development group

2013-07-17 Thread Mattmann, Chris A (398J)
chris.mattm...@gmail.com

++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++






-Original Message-
From: sandeep rg sandeep.f...@gmail.com
Reply-To: dev@ctakes.apache.org dev@ctakes.apache.org
Date: Wednesday, July 17, 2013 6:53 AM
To: dev@ctakes.apache.org dev@ctakes.apache.org
Subject: Re: to involve in your development group

can you provide your gmail id to share the proposal document with you?




RE: to involve in your development group

2013-07-17 Thread Chen, Pei
pei.stat...@gmail.com

 -Original Message-
 From: Mattmann, Chris A (398J) [mailto:chris.a.mattm...@jpl.nasa.gov]
 Sent: Wednesday, July 17, 2013 10:22 AM
 To: dev@ctakes.apache.org
 Subject: Re: to involve in your development group
 
 chris.mattm...@gmail.com
 

Re: to involve in your development group

2013-07-17 Thread sandeep rg
The sample work is shared with you both. If any more details should be
included, please tell me.
The GUI design, schedule and implementation flow chart are still to be
added; they are under construction and will be uploaded within a few hours.


On Wed, Jul 17, 2013 at 7:56 PM, Chen, Pei pei.c...@childrens.harvard.edu wrote:

 pei.stat...@gmail.com


Re: to involve in your development group

2013-07-17 Thread sandeep rg
It is just a skeleton of the original proposal.


On Wed, Jul 17, 2013 at 9:31 PM, sandeep rg sandeep.f...@gmail.com wrote:

 The sample work is shared with you both. If any more details should be
 included, please tell me.
 The GUI design, schedule and implementation flow chart are still to be
 added; they are under construction and will be uploaded within a few hours.



Re: to involve in your development group

2013-07-17 Thread sandeep rg
I have done the sequence diagram and made some small changes. Please go
through it and tell me if anything more should be included.


On Wed, Jul 17, 2013 at 9:37 PM, sandeep rg sandeep.f...@gmail.com wrote:

 It is just a skeleton of the original proposal.


 On Wed, Jul 17, 2013 at 9:31 PM, sandeep rg sandeep.f...@gmail.comwrote:

 the sample work is shared with you both.any more details to be included
 please tell me.
 In which,GUI design,schedule and implementation flow chart design is to
 added which is under construction and will be uploaded within few hours.


 On Wed, Jul 17, 2013 at 7:56 PM, Chen, Pei 
 pei.c...@childrens.harvard.edu wrote:

 pei.stat...@gmail.com

  -Original Message-
  From: Mattmann, Chris A (398J) [mailto:chris.a.mattm...@jpl.nasa.gov]
  Sent: Wednesday, July 17, 2013 10:22 AM
  To: dev@ctakes.apache.org
  Subject: Re: to involve in your development group
 
  chris.mattm...@gmail.com
 
  ++
  
  Chris Mattmann, Ph.D.
  Senior Computer Scientist
  NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
  Office: 171-266B, Mailstop: 171-246
  Email: chris.a.mattm...@nasa.gov
  WWW:  http://sunset.usc.edu/~mattmann/
  ++
  
  Adjunct Assistant Professor, Computer Science Department University of
  Southern California, Los Angeles, CA 90089 USA
  ++
  
 
 
 
 
 
 
  -Original Message-
  From: sandeep rg sandeep.f...@gmail.com
  Reply-To: dev@ctakes.apache.org dev@ctakes.apache.org
  Date: Wednesday, July 17, 2013 6:53 AM
  To: dev@ctakes.apache.org dev@ctakes.apache.org
  Subject: Re: to involve in your development group
 
  can you provide your gmail id to share the proposal document with you?
  
  
  
  On Tue, Jul 16, 2013 at 11:33 PM, sandeep rg sandeep.f...@gmail.com
  wrote:
  
   sir,
   i am providing proposal by two days.now i am mainly going through
  ASF-ICFOSS gateway because if i gone through their way and my
 proposal
  is  get selected,ICFOSS will provide some sort of support such as
  certificates,small financial support etc. to us.
  
  
   but,main thing is i like programming,i like to explore through the
   new technologies in coding and like to interact with the coding.so
 if
   my proposal is got rejected,then also i like to work in your project
   as a volunteer if you allow me..
  
   now i am preparing a proposal,within 2 days i will submit
   it..Mattmann chris helped me to know more about the format of
  proposal.
  
  
   On Tue, Jul 16, 2013 at 8:12 PM, Chen, Pei
  pei.c...@childrens.harvard.edu
wrote:
  
   Chris/Sandeep,
   According to ASF-ICFOSS, I believe the deadline for submitting
  proposals  is this coming Friday (July 19).
   After which point, mentors will have 2 weeks to review and
  score/accept.
   Just curious, are we planning to follow the same process here?  Or
  since  it's all volunteer work, technically- sandeep and still
  contribute code to  the community and participate in the dev group
  here.
  
   Looking forward to it.
   --Pei
  
  
-Original Message-
From: sandeep rg [mailto:sandeep.f...@gmail.com]
Sent: Monday, July 15, 2013 1:05 PM
To: dev@ctakes.apache.org
Subject: Re: to involve in your development group
   
sir,
i gone through most of the ocr technologies and reached a
  conclusion.i
would like to use apache tika and java ocr for this pupose.
   
Tessearact is a ocr tool,it can be used for extracting from
multiple languages.it is implemented in vc++.so it can acceded
using java
  native
function.they provided another  tool tess4j but review says that
it
  has
many bugs.
   
Apache tika developed in java language.it can be used to extract
text
   data
from .xls,word,txt,pdf and other many formats.it is easy for
   implementing
in project also.i have just gone through its implementation way.
   
then about javaocr,its good for extrating text from a jpeg or
scanned images.we can train it with various fonts.more we train
more will be
  its
accuracy but its speed will get decreased.i didn't find any
  particular
documentation for that.
   
   
   
On Sun, Jul 14, 2013 at 9:18 PM, sandeep rg
sandeep.f...@gmail.com
wrote:
   
 Thanks a lot to both of you for your support. I will do my best to find
 a solution for the JIRA problem. I will share the proposal with both of you.



 On Sun, Jul 14, 2013 at 1:46 AM, Chen, Pei
pei.c...@childrens.harvard.edu
  wrote:

 Sandeep,
 It's great to have Chris on board as well; he was one of the
 coordinators of GSoC.
 Looking forward to it.

 Sent from my iPhone

 On Jul 13, 2013, at 12:24 PM, Mattmann, Chris A (398J) 
 chris.a.mattm...@jpl.nasa.gov wrote:

  Hi Sandeep,
 
  That is great news, and good job

RE: to involve in your development group

2013-07-16 Thread Chen, Pei
Chris/Sandeep,
According to ASF-ICFOSS, I believe the deadline for submitting proposals is 
this coming Friday (July 19).  
After which point, mentors will have 2 weeks to review and score/accept.
Just curious, are we planning to follow the same process here?  Or, since it's 
all volunteer work, technically Sandeep can still contribute code to the 
community and participate in the dev group here.

Looking forward to it.
--Pei


 -Original Message-
 From: sandeep rg [mailto:sandeep.f...@gmail.com]
 Sent: Monday, July 15, 2013 1:05 PM
 To: dev@ctakes.apache.org
 Subject: Re: to involve in your development group
 
 Sir,
 I have gone through most of the OCR technologies and reached a conclusion:
 I would like to use Apache Tika and JavaOCR for this purpose.
 
 Tesseract is an OCR tool that can extract text in multiple languages.
 It is implemented in VC++, so it can be accessed from Java through native
 function calls. Another tool, Tess4J, is also provided, but reviews say
 that it has many bugs.
 
 Apache Tika is developed in Java. It can be used to extract text data
 from .xls, Word, .txt, PDF and many other formats. It is also easy to
 integrate into the project. I have just gone through how it is implemented.
 
 As for JavaOCR, it is good for extracting text from a JPEG or scanned
 images. We can train it with various fonts: the more we train it, the better
 its accuracy, but its speed will decrease. I didn't find any particular
 documentation for it.
 
 
 
 On Sun, Jul 14, 2013 at 9:18 PM, sandeep rg sandeep.f...@gmail.com
 wrote:
 
  Thanks a lot to both of you for your support. I will do my best to find a
  solution for the JIRA problem. I will share the proposal with both of you.
 
 
 
  On Sun, Jul 14, 2013 at 1:46 AM, Chen, Pei
 pei.c...@childrens.harvard.edu
   wrote:
 
  Sandeep,
  It's great to have Chris on board as well; he was one of the coordinators
  of GSoC.
  Looking forward to it.
 
  Sent from my iPhone
 
  On Jul 13, 2013, at 12:24 PM, Mattmann, Chris A (398J) 
  chris.a.mattm...@jpl.nasa.gov wrote:
 
   Hi Sandeep,
  
   That is great news, and good job. OK, for some ideas about developing
   your proposal, you may want to simply start with a Google Docs, and
 then
   share it with Pei. I'd be happy to help co-mentor if Pei and you think
   it's useful too.
  
   Your proposal should likely cover:
  
   1. Background - what's the state of CTAKES-189 and what's it trying to
   accomplish
(include some figures, etc. along with your text)
  
   2. Approach - what are you going to do to solve CTAKES-189. Be specific,
   and
try to break it down into smaller, easily reversible steps
  
   3. Schedule - how long and what is the schedule for achieving this?
  
   4. Risks/etc. - any known risks like are you taking a vacation anytime
   soon :)
or are there other time constraints?
  
   5. References, etc.
  
   HTH and I'd be happy if you want to share the GDocs with me as you
  develop
   it.
  
   Cheers!
  
   Chris
  
  
 ++
 
   Chris Mattmann, Ph.D.
   Senior Computer Scientist
   NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
   Office: 171-266B, Mailstop: 171-246
   Email: chris.a.mattm...@nasa.gov
   WWW:  http://sunset.usc.edu/~mattmann/
  
 ++
 
   Adjunct Assistant Professor, Computer Science Department
   University of Southern California, Los Angeles, CA 90089 USA
  
 ++
 
  
  
  
  
  
  
   -Original Message-
   From: sandeep rg sandeep.f...@gmail.com
   Reply-To: dev@ctakes.apache.org dev@ctakes.apache.org
   Date: Saturday, July 13, 2013 8:57 AM
   To: dev@ctakes.apache.org dev@ctakes.apache.org
   Subject: Re: to involve in your development group
  
   i have also gone through the technologies available for development
 of
   ocr,from that i think apache tika and tessearact is best for resolving
  the
   problem.
  
  
   On Sat, Jul 13, 2013 at 9:02 PM, sandeep rg
 sandeep.f...@gmail.com
   wrote:
  
   hi Mattamann Chris,
   i has participated in the event coordinated by luciano resende
  
   http://community.apache.org/mentoringprogramme-icfoss-pilot.html
  
   and from that i learned about open source and like to work on your
   project
   ctakes.i would like to fix the jira
  
   https://issues.apache.org/jira/browse/CTAKES-189
  
   chen pei accepted my requested to be my mentor.now i want to give
 a
   proposal to apache about the project i am going to work on.can you
  help
   me
   to prepare a proposal to be submitted before 18 th of this july.
  
  
  
  
  
  
   On Sat, Jul 13, 2013 at 2:26 AM, Mattmann, Chris A (398J) 
   chris.a.mattm...@jpl.nasa.gov wrote:
  
   Hi Sandeep,
  
   I think the best thing to do is:
  
   1. Develop a JIRA issue here:
   https://issues.apache.org/jira/browse/CTAKES
   1a. you can register for a new account on JIRA
   2. Once your JIRA issue

Re: to involve in your development group

2013-07-15 Thread sandeep rg
Sir,
I have gone through most of the OCR technologies and reached a conclusion:
I would like to use Apache Tika and JavaOCR for this purpose.

Tesseract is an OCR tool that can extract text in multiple languages.
It is implemented in VC++, so it can be accessed from Java through native
function calls. Another tool, Tess4J, is also provided, but reviews say
that it has many bugs.

Apache Tika is developed in Java. It can be used to extract text data
from .xls, Word, .txt, PDF and many other formats. It is also easy to
integrate into the project. I have just gone through how it is implemented.

As for JavaOCR, it is good for extracting text from a JPEG or scanned
images. We can train it with various fonts: the more we train it, the better
its accuracy, but its speed will decrease. I didn't find any particular
documentation for it.



On Sun, Jul 14, 2013 at 9:18 PM, sandeep rg sandeep.f...@gmail.com wrote:

 Thanks a lot to both of you for your support. I will do my best to find a
 solution for the JIRA problem. I will share the proposal with both of you.



 On Sun, Jul 14, 2013 at 1:46 AM, Chen, Pei pei.c...@childrens.harvard.edu
  wrote:

 Sandeep,
 It's great to have Chris on board as well; he was one of the coordinators
 of GSoC.
 Looking forward to it.

 Sent from my iPhone

 On Jul 13, 2013, at 12:24 PM, Mattmann, Chris A (398J) 
 chris.a.mattm...@jpl.nasa.gov wrote:

  Hi Sandeep,
 
  That is great news, and good job. OK, for some ideas about developing
  your proposal, you may want to simply start with a Google Docs, and then
  share it with Pei. I'd be happy to help co-mentor if Pei and you think
  it's useful too.
 
  Your proposal should likely cover:
 
  1. Background - what's the state of CTAKES-189 and what's it trying to
  accomplish
   (include some figures, etc. along with your text)
 
  2. Approach - what are you going to do to solve CTAKES-189. Be specific,
  and
   try to break it down into smaller, easily reversible steps
 
  3. Schedule - how long and what is the schedule for achieving this?
 
  4. Risks/etc. - any known risks like are you taking a vacation anytime
  soon :)
   or are there other time constraints?
 
  5. References, etc.
 
  HTH and I'd be happy if you want to share the GDocs with me as you
 develop
  it.
 
  Cheers!
 
  Chris
 
  ++
  Chris Mattmann, Ph.D.
  Senior Computer Scientist
  NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
  Office: 171-266B, Mailstop: 171-246
  Email: chris.a.mattm...@nasa.gov
  WWW:  http://sunset.usc.edu/~mattmann/
  ++
  Adjunct Assistant Professor, Computer Science Department
  University of Southern California, Los Angeles, CA 90089 USA
  ++
 
 
 
 
 
 
  -Original Message-
  From: sandeep rg sandeep.f...@gmail.com
  Reply-To: dev@ctakes.apache.org dev@ctakes.apache.org
  Date: Saturday, July 13, 2013 8:57 AM
  To: dev@ctakes.apache.org dev@ctakes.apache.org
  Subject: Re: to involve in your development group
 
  i have also gone through the technologies available for development of
  ocr,from that i think apache tika and tessearact is best for resolving
 the
  problem.
 
 
  On Sat, Jul 13, 2013 at 9:02 PM, sandeep rg sandeep.f...@gmail.com
  wrote:
 
  hi Mattamann Chris,
  i has participated in the event coordinated by luciano resende
 
  http://community.apache.org/mentoringprogramme-icfoss-pilot.html
 
  and from that i learned about open source and like to work on your
  project
  ctakes.i would like to fix the jira
 
  https://issues.apache.org/jira/browse/CTAKES-189
 
  chen pei accepted my requested to be my mentor.now i want to give a
  proposal to apache about the project i am going to work on.can you
 help
  me
  to prepare a proposal to be submitted before 18 th of this july.
 
 
 
 
 
 
  On Sat, Jul 13, 2013 at 2:26 AM, Mattmann, Chris A (398J) 
  chris.a.mattm...@jpl.nasa.gov wrote:
 
  Hi Sandeep,
 
  I think the best thing to do is:
 
  1. Develop a JIRA issue here:
  https://issues.apache.org/jira/browse/CTAKES
  1a. you can register for a new account on JIRA
  2. Once your JIRA issue is created, feel free to start a [DISCUSS]
  thread
  (e.g., with subject [DISCUSS] some topic where some topic is
  perhaps
  the main idea you have) on dev@ctakes.apache.org, referencing your
  issue
  and
  asking for feedback
  3. Work with the Apache cTAKES PMC and committers to get your patches
  and
  other items attached to your issue from #1 committed into the sources
 
  Ideally if 1-3 happen and it's a good interaction, Apache is built on
  meritocracy and you could possibly earn the merit to become a PMC
  member
  or committer on the project.
 
  Cheers,
  Chris
 
  ++
  Chris Mattmann, Ph.D.
  Senior Computer Scientist
  NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
  Office: 171-266B

Re: to involve in your development group

2013-07-14 Thread sandeep rg
Thanks a lot to both of you for your support. I will do my best to find a
solution for the JIRA problem. I will share the proposal with both of you.



On Sun, Jul 14, 2013 at 1:46 AM, Chen, Pei
pei.c...@childrens.harvard.edu wrote:

 Sandeep,
 It's great to have Chris on board as well; he was one of the coordinators
 of GSoC.
 Looking forward to it.

 Sent from my iPhone

 On Jul 13, 2013, at 12:24 PM, Mattmann, Chris A (398J) 
 chris.a.mattm...@jpl.nasa.gov wrote:

  Hi Sandeep,
 
  That is great news, and good job. OK, for some ideas about developing
  your proposal, you may want to simply start with a Google Docs, and then
  share it with Pei. I'd be happy to help co-mentor if Pei and you think
  it's useful too.
 
  Your proposal should likely cover:
 
  1. Background - what's the state of CTAKES-189 and what's it trying to
  accomplish
   (include some figures, etc. along with your text)
 
  2. Approach - what are you going to do to solve CTAKES-189. Be specific,
  and
   try to break it down into smaller, easily reversible steps
 
  3. Schedule - how long and what is the schedule for achieving this?
 
  4. Risks/etc. - any known risks like are you taking a vacation anytime
  soon :)
   or are there other time constraints?
 
  5. References, etc.
 
  HTH and I'd be happy if you want to share the GDocs with me as you
 develop
  it.
 
  Cheers!
 
  Chris
 
  ++
  Chris Mattmann, Ph.D.
  Senior Computer Scientist
  NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
  Office: 171-266B, Mailstop: 171-246
  Email: chris.a.mattm...@nasa.gov
  WWW:  http://sunset.usc.edu/~mattmann/
  ++
  Adjunct Assistant Professor, Computer Science Department
  University of Southern California, Los Angeles, CA 90089 USA
  ++
 
 
 
 
 
 
  -Original Message-
  From: sandeep rg sandeep.f...@gmail.com
  Reply-To: dev@ctakes.apache.org dev@ctakes.apache.org
  Date: Saturday, July 13, 2013 8:57 AM
  To: dev@ctakes.apache.org dev@ctakes.apache.org
  Subject: Re: to involve in your development group
 
  i have also gone through the technologies available for development of
  ocr,from that i think apache tika and tessearact is best for resolving
 the
  problem.
 
 
  On Sat, Jul 13, 2013 at 9:02 PM, sandeep rg sandeep.f...@gmail.com
  wrote:
 
  hi Mattamann Chris,
  i has participated in the event coordinated by luciano resende
 
  http://community.apache.org/mentoringprogramme-icfoss-pilot.html
 
  and from that i learned about open source and like to work on your
  project
  ctakes.i would like to fix the jira
 
  https://issues.apache.org/jira/browse/CTAKES-189
 
  chen pei accepted my requested to be my mentor.now i want to give a
  proposal to apache about the project i am going to work on.can you help
  me
  to prepare a proposal to be submitted before 18 th of this july.
 
 
 
 
 
 
  On Sat, Jul 13, 2013 at 2:26 AM, Mattmann, Chris A (398J) 
  chris.a.mattm...@jpl.nasa.gov wrote:
 
  Hi Sandeep,
 
  I think the best thing to do is:
 
  1. Develop a JIRA issue here:
  https://issues.apache.org/jira/browse/CTAKES
  1a. you can register for a new account on JIRA
  2. Once your JIRA issue is created, feel free to start a [DISCUSS]
  thread
  (e.g., with subject [DISCUSS] some topic where some topic is
  perhaps
  the main idea you have) on dev@ctakes.apache.org, referencing your
  issue
  and
  asking for feedback
  3. Work with the Apache cTAKES PMC and committers to get your patches
  and
  other items attached to your issue from #1 committed into the sources
 
  Ideally if 1-3 happen and it's a good interaction, Apache is built on
  meritocracy and you could possibly earn the merit to become a PMC
  member
  or committer on the project.
 
  Cheers,
  Chris
 
  ++
  Chris Mattmann, Ph.D.
  Senior Computer Scientist
  NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
  Office: 171-266B, Mailstop: 171-246
  Email: chris.a.mattm...@nasa.gov
  WWW:  http://sunset.usc.edu/~mattmann/
  ++
  Adjunct Assistant Professor, Computer Science Department
  University of Southern California, Los Angeles, CA 90089 USA
  ++
 
 
 
 
 
 
  -Original Message-
  From: sandeep rg sandeep.f...@gmail.com
  Reply-To: dev@ctakes.apache.org dev@ctakes.apache.org
  Date: Thursday, July 11, 2013 11:30 AM
  To: dev@ctakes.apache.org dev@ctakes.apache.org
  Subject: Re: to involve in your development group
 
  can you provide what all details i should include in a
  proposal?whether i
  wanted to include all implemetation(technical) details in the
  proposal?
 
 
  On Thu, Jul 11, 2013 at 9:45 PM, Mattmann, Chris A (398J) 
  chris.a.mattm...@jpl.nasa.gov wrote

Re: to involve in your development group

2013-07-13 Thread sandeep rg
Hi Chris Mattmann,
I participated in the event coordinated by Luciano Resende:

http://community.apache.org/mentoringprogramme-icfoss-pilot.html

From that I learned about open source, and I would like to work on your
project, cTAKES. I would like to fix this JIRA issue:

https://issues.apache.org/jira/browse/CTAKES-189

Chen Pei accepted my request to be my mentor. Now I want to submit a
proposal to Apache about the project I am going to work on. Can you help me
prepare a proposal to be submitted before the 18th of July?






On Sat, Jul 13, 2013 at 2:26 AM, Mattmann, Chris A (398J) 
chris.a.mattm...@jpl.nasa.gov wrote:

 Hi Sandeep,

 I think the best thing to do is:

 1. Develop a JIRA issue here: https://issues.apache.org/jira/browse/CTAKES
  1a. you can register for a new account on JIRA
 2. Once your JIRA issue is created, feel free to start a [DISCUSS] thread
 (e.g., with subject [DISCUSS] some topic where some topic is perhaps
 the main idea you have) on dev@ctakes.apache.org, referencing your issue
 and
 asking for feedback
 3. Work with the Apache cTAKES PMC and committers to get your patches and
 other items attached to your issue from #1 committed into the sources

 Ideally if 1-3 happen and it's a good interaction, Apache is built on
 meritocracy and you could possibly earn the merit to become a PMC member
 or committer on the project.

 Cheers,
 Chris

 ++
 Chris Mattmann, Ph.D.
 Senior Computer Scientist
 NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
 Office: 171-266B, Mailstop: 171-246
 Email: chris.a.mattm...@nasa.gov
 WWW:  http://sunset.usc.edu/~mattmann/
 ++
 Adjunct Assistant Professor, Computer Science Department
 University of Southern California, Los Angeles, CA 90089 USA
 ++






 -Original Message-
 From: sandeep rg sandeep.f...@gmail.com
 Reply-To: dev@ctakes.apache.org dev@ctakes.apache.org
 Date: Thursday, July 11, 2013 11:30 AM
 To: dev@ctakes.apache.org dev@ctakes.apache.org
 Subject: Re: to involve in your development group

 Can you tell me which details I should include in a proposal? Should I
 include all implementation (technical) details in the proposal?
 
 
 On Thu, Jul 11, 2013 at 9:45 PM, Mattmann, Chris A (398J) 
 chris.a.mattm...@jpl.nasa.gov wrote:
 
  Dear Sandeep,
 
  Thanks for your interest in cTAKES. We would welcome your contribution
  and are happy to have your interest in the project.
 
  Cheers,
  Chris
 
  ++
  Chris Mattmann, Ph.D.
  Senior Computer Scientist
  NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
  Office: 171-266B, Mailstop: 171-246
  Email: chris.a.mattm...@nasa.gov
  WWW:  http://sunset.usc.edu/~mattmann/
  ++
  Adjunct Assistant Professor, Computer Science Department
  University of Southern California, Los Angeles, CA 90089 USA
  ++
 
 
 
 
 
 
  -Original Message-
  From: sandeep rg sandeep.f...@gmail.com
  Reply-To: dev@ctakes.apache.org dev@ctakes.apache.org
  Date: Wednesday, July 10, 2013 11:01 AM
  To: dev@ctakes.apache.org dev@ctakes.apache.org
  Subject: Re: to involve in your development group
 
  sir,
  
  My name is Sandeep RG. I am a B.Tech graduate in computer science, now
  doing an internship at a company, working in Java.
  
  I have installed everything successfully and am now downloading the
  resources. It takes too much time.
  
  I have gone through the suggested OCR technologies.
  JavaOCR has some good user reviews.
  Apache Tika can process many different formats.
  Beyond that, there is Tesseract, which is also used for OCR.
  Apache PDFBox is also used for text extraction, but only for PDF
  files.
  Now I am going through everything to find the best technology among
  these.
  
  
  On Wed, Jul 10, 2013 at 12:52 AM, Chen, Pei
  pei.c...@childrens.harvard.edu wrote:
  
   Hi Sandeep,
   I am delighted to work with you on this project.
  
   I was not sure if I understood you correctly- did you mean to say
 that
  you
   have already tried using cTAKES and it's components?
   If not, you can do an svn checkout of the code and try running the
   debugger gui from the command line (or eclipseide) that will allow
 you
  to
   type in plain text and get back the different structured content
 (types)
   that cTAKES produces:
   MAVEN_OPTS=-Xmx2g -Xms1g
   mvn -PrunCVD compile
   From the guide:
  
  
 
 
 https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+Developer+Install+Guide
  
   A bit of background:
   Apache cTAKES uses SVN for version control:
   https://svn.apache.org/repos/asf/ctakes/trunk/
   Jira for issues tracking:
   https://issues.apache.org/jira

Re: to involve in your development group

2013-07-13 Thread Mattmann, Chris A (398J)
Hi Sandeep,

That is great news, and good job. OK, for some ideas about developing
your proposal, you may want to simply start with a Google Doc, and then
share it with Pei. I'd be happy to help co-mentor if Pei and you think
it's useful too.

Your proposal should likely cover:

1. Background - what's the state of CTAKES-189 and what's it trying to
   accomplish (include some figures, etc. along with your text)

2. Approach - what are you going to do to solve CTAKES-189? Be specific,
   and try to break it down into smaller, easily reversible steps

3. Schedule - how long and what is the schedule for achieving this?

4. Risks/etc. - any known risks, like are you taking a vacation anytime
   soon :) or are there other time constraints?

5. References, etc.

HTH and I'd be happy if you want to share the GDocs with me as you develop
it.

Cheers!

Chris

++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++






-Original Message-
From: sandeep rg sandeep.f...@gmail.com
Reply-To: dev@ctakes.apache.org dev@ctakes.apache.org
Date: Saturday, July 13, 2013 8:57 AM
To: dev@ctakes.apache.org dev@ctakes.apache.org
Subject: Re: to involve in your development group

I have also gone through the technologies available for OCR development;
from those, I think Apache Tika and Tesseract are the best for resolving the
problem.


On Sat, Jul 13, 2013 at 9:02 PM, sandeep rg sandeep.f...@gmail.com
wrote:

 hi Mattamann Chris,
 i has participated in the event coordinated by luciano resende

 http://community.apache.org/mentoringprogramme-icfoss-pilot.html

 and from that i learned about open source and like to work on your
project
 ctakes.i would like to fix the jira

 https://issues.apache.org/jira/browse/CTAKES-189

 chen pei accepted my requested to be my mentor.now i want to give a
 proposal to apache about the project i am going to work on.can you help
me
 to prepare a proposal to be submitted before 18 th of this july.






 On Sat, Jul 13, 2013 at 2:26 AM, Mattmann, Chris A (398J) 
 chris.a.mattm...@jpl.nasa.gov wrote:

 Hi Sandeep,

 I think the best thing to do is:

 1. Develop a JIRA issue here:
 https://issues.apache.org/jira/browse/CTAKES
  1a. you can register for a new account on JIRA
 2. Once your JIRA issue is created, feel free to start a [DISCUSS]
thread
 (e.g., with subject [DISCUSS] some topic where some topic is
perhaps
 the main idea you have) on dev@ctakes.apache.org, referencing your
issue
 and
 asking for feedback
 3. Work with the Apache cTAKES PMC and committers to get your patches
and
 other items attached to your issue from #1 committed into the sources

 Ideally if 1-3 happen and it's a good interaction, Apache is built on
 meritocracy and you could possibly earn the merit to become a PMC
member
 or committer on the project.

 Cheers,
 Chris

 ++
 Chris Mattmann, Ph.D.
 Senior Computer Scientist
 NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
 Office: 171-266B, Mailstop: 171-246
 Email: chris.a.mattm...@nasa.gov
 WWW:  http://sunset.usc.edu/~mattmann/
 ++
 Adjunct Assistant Professor, Computer Science Department
 University of Southern California, Los Angeles, CA 90089 USA
 ++






 -Original Message-
 From: sandeep rg sandeep.f...@gmail.com
 Reply-To: dev@ctakes.apache.org dev@ctakes.apache.org
 Date: Thursday, July 11, 2013 11:30 AM
 To: dev@ctakes.apache.org dev@ctakes.apache.org
 Subject: Re: to involve in your development group

 can you provide what all details i should include in a
proposal?whether i
 wanted to include all implemetation(technical) details in the
proposal?
 
 
 On Thu, Jul 11, 2013 at 9:45 PM, Mattmann, Chris A (398J) 
 chris.a.mattm...@jpl.nasa.gov wrote:
 
  Dear Sandeep,
 
  Thanks for your interest in cTAKES. We would welcome your
contribution
  and are happy to have your interest in the project.
 
  Cheers,
  Chris
 
  ++
  Chris Mattmann, Ph.D.
  Senior Computer Scientist
  NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
  Office: 171-266B, Mailstop: 171-246
  Email: chris.a.mattm...@nasa.gov
  WWW:  http://sunset.usc.edu/~mattmann/
  ++
  Adjunct Assistant Professor, Computer Science Department
  University of Southern California, Los Angeles, CA 90089 USA

Re: to involve in your development group

2013-07-12 Thread Mattmann, Chris A (398J)
Hi Sandeep,

I think the best thing to do is:

1. Develop a JIRA issue here: https://issues.apache.org/jira/browse/CTAKES
 1a. you can register for a new account on JIRA
2. Once your JIRA issue is created, feel free to start a [DISCUSS] thread
(e.g., with subject "[DISCUSS] some topic", where "some topic" is perhaps
the main idea you have) on dev@ctakes.apache.org, referencing your issue
and asking for feedback
3. Work with the Apache cTAKES PMC and committers to get your patches and
other items attached to your issue from #1 committed into the sources

Ideally if 1-3 happen and it's a good interaction, Apache is built on
meritocracy and you could possibly earn the merit to become a PMC member
or committer on the project.

Cheers,
Chris

++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++






-Original Message-
From: sandeep rg sandeep.f...@gmail.com
Reply-To: dev@ctakes.apache.org dev@ctakes.apache.org
Date: Thursday, July 11, 2013 11:30 AM
To: dev@ctakes.apache.org dev@ctakes.apache.org
Subject: Re: to involve in your development group

Can you tell me which details I should include in a proposal? Should I
include all implementation (technical) details in the proposal?


On Thu, Jul 11, 2013 at 9:45 PM, Mattmann, Chris A (398J) 
chris.a.mattm...@jpl.nasa.gov wrote:

 Dear Sandeep,

 Thanks for your interest in cTAKES. We would welcome your contribution
 and are happy to have your interest in the project.

 Cheers,
 Chris

 ++
 Chris Mattmann, Ph.D.
 Senior Computer Scientist
 NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
 Office: 171-266B, Mailstop: 171-246
 Email: chris.a.mattm...@nasa.gov
 WWW:  http://sunset.usc.edu/~mattmann/
 ++
 Adjunct Assistant Professor, Computer Science Department
 University of Southern California, Los Angeles, CA 90089 USA
 ++






 -Original Message-
 From: sandeep rg sandeep.f...@gmail.com
 Reply-To: dev@ctakes.apache.org dev@ctakes.apache.org
 Date: Wednesday, July 10, 2013 11:01 AM
 To: dev@ctakes.apache.org dev@ctakes.apache.org
 Subject: Re: to involve in your development group

 sir,
 
 My name is Sandeep RG. I am a B.Tech graduate in computer science, now
 doing an internship at a company, working in Java.
 
 I have installed everything successfully and am now downloading the
 resources. It takes too much time.
 
 I have gone through the suggested OCR technologies.
 JavaOCR has some good user reviews.
 Apache Tika can process many different formats.
 Beyond that, there is Tesseract, which is also used for OCR.
 Apache PDFBox is also used for text extraction, but only for PDF
 files.
 Now I am going through everything to find the best technology among
 these.
 
 
 On Wed, Jul 10, 2013 at 12:52 AM, Chen, Pei
 pei.c...@childrens.harvard.edu wrote:
 
  Hi Sandeep,
  I am delighted to work with you on this project.
 
  I was not sure if I understood you correctly. Did you mean to say that
  you have already tried using cTAKES and its components?
  If not, you can do an svn checkout of the code and try running the
  debugger GUI from the command line (or the Eclipse IDE). It will allow
  you to type in plain text and get back the different structured content
  (types) that cTAKES produces:
  MAVEN_OPTS="-Xmx2g -Xms1g"
  mvn -PrunCVD compile
  From the guide:
 
  https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+Developer+Install+Guide
 
  A bit of background:
  Apache cTAKES uses SVN for version control:
  https://svn.apache.org/repos/asf/ctakes/trunk/
  Jira for issues tracking:
  https://issues.apache.org/jira/browse/ctakes
  Maven for building and dependency management.
  A lot of the developers use Eclipse IDE for their development.
  More info on ctakes.apache.org
 
  cTAKES is built on top of the Apache UIMA Framework.  Essentially,
  cTAKES is a collection of Annotators (Java classes) wired together into
  a pipeline.
  Its goal, in a nutshell, is to turn unstructured plain text into a
  structured/normalized form, and it is specially trained for medical notes.
  Right now, the input cTAKES expects is plain text, and cTAKES
  does not have an OCR component.
  CTAKES-189 (GSoC: implement OCR/Tika to standardize text inputs) was an
  idea to allow cTAKES to take in any type of input (PDF, images, Word, XLS,
  etc.)
  and pass the text

Re: to involve in your development group

2013-07-11 Thread Mattmann, Chris A (398J)
Dear Sandeep,

Thanks for your interest in cTAKES. We would welcome your contribution
and are happy to have your interest in the project.

Cheers,
Chris

++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++






-Original Message-
From: sandeep rg sandeep.f...@gmail.com
Reply-To: dev@ctakes.apache.org dev@ctakes.apache.org
Date: Wednesday, July 10, 2013 11:01 AM
To: dev@ctakes.apache.org dev@ctakes.apache.org
Subject: Re: to involve in your development group

Sir,

My name is Sandeep RG. I am a B.Tech graduate in computer science, now doing
an internship at a company, working in Java.

I have installed everything successfully and am now downloading the
resources. It takes too much time.

I have gone through the suggested OCR technologies.
JavaOCR has some good user reviews.
Apache Tika can process many different formats.
Beyond that, there is Tesseract, which is also used for OCR.
Apache PDFBox is also used for text extraction, but only for PDF
files.
Now I am going through everything to find the best technology among these.


On Wed, Jul 10, 2013 at 12:52 AM, Chen, Pei
pei.c...@childrens.harvard.edu wrote:

 Hi Sandeep,
 I am delighted to work with you on this project.

 I was not sure if I understood you correctly. Did you mean to say that you
 have already tried using cTAKES and its components?
 If not, you can do an svn checkout of the code and try running the
 debugger GUI from the command line (or the Eclipse IDE). It will allow you
 to type in plain text and get back the different structured content (types)
 that cTAKES produces:
 MAVEN_OPTS="-Xmx2g -Xms1g"
 mvn -PrunCVD compile
 From the guide:
 
 https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+Developer+Install+Guide

 A bit of background:
 Apache cTAKES uses SVN for version control:
 https://svn.apache.org/repos/asf/ctakes/trunk/
 Jira for issue tracking:
 https://issues.apache.org/jira/browse/ctakes
 Maven for building and dependency management.
 A lot of the developers use Eclipse IDE for their development.
 More info on ctakes.apache.org

 cTAKES is built on top of the Apache UIMA Framework.  Essentially, cTAKES
 is a collection of Annotators (Java classes) wired together into a
 pipeline.
 Its goal, in a nutshell, is to turn unstructured plain text into a
 structured/normalized form, and it is specially trained for medical notes.
 Right now, the input cTAKES expects is plain text, and cTAKES
 does not have an OCR component.
 CTAKES-189 (GSoC: implement OCR/Tika to standardize text inputs) was an idea
 to allow cTAKES to take in any type of input (PDF, images, Word, XLS, etc.)
 and pass the text on for cTAKES processing.
 [I was originally thinking this could be done in some kind of
 preprocessing, or an optional Annotator that could be added at the
 beginning of a pipeline].  There may be some existing work that could
 potentially be reused: Apache Tika (
 https://issues.apache.org/jira/browse/TIKA-93 ) as well as some open
 source OCR toolkits (JavaOCR).

 About Me:

 
http://childrenshospital.org/cfapps/research/data_admin/Site3240/mainpage
S3240P8.html
 http://www.linkedin.com/in/peistation
 http://people.apache.org/committer-index.html#chenpei

  -Original Message-
  From: sandeep rg [mailto:sandeep.f...@gmail.com]
  Sent: Tuesday, July 09, 2013 1:19 PM
  To: dev@ctakes.apache.org
  Subject: Re: to involve in your development group
 
  Thanks a lot for giving me support.i like to work with you.
 
  I have gone through the objectives of the software,used the software
and
  gone through various components of the project.can you provide me
 starting
  point from where i should start to know more about the coding part of
the
  project.
 
  can you tell me more about the project and about you also?
 
 
  On Tue, Jul 9, 2013 at 1:14 AM, Chen, Pei
  pei.c...@childrens.harvard.eduwrote:
 
   Hi Sandeep,
   Thank you for the interest.  I just had a quick look at the ICFOSS
   pilot mentoring program and will be happy to serve as a mentor for
   your project
   proposal(s) if you are interested.
  
   --Pei
  
-Original Message-
From: sandeep rg [mailto:sandeep.f...@gmail.com]
Sent: Monday, July 08, 2013 2:24 PM
To: dev@ctakes.apache.org
Subject: Re: to involve in your development group
   
sir,
   
details of the program Pilot mentoring programme with india ICFOSS
is
   given
in the below web address
   
http://community.apache.org/mentoringprogramme-icfoss-pilot.html

Re: to involve in your development group

2013-07-11 Thread sandeep rg
Can you tell me which details I should include in a proposal? Should I
include all the implementation (technical) details in the proposal?


On Thu, Jul 11, 2013 at 9:45 PM, Mattmann, Chris A (398J) 
chris.a.mattm...@jpl.nasa.gov wrote:

 Dear Sandeep,

 Thanks for your interest in cTAKES. We would welcome your contribution
 and are happy to have your interest in the project.

 Cheers,
 Chris

 ++
 Chris Mattmann, Ph.D.
 Senior Computer Scientist
 NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
 Office: 171-266B, Mailstop: 171-246
 Email: chris.a.mattm...@nasa.gov
 WWW:  http://sunset.usc.edu/~mattmann/
 ++
 Adjunct Assistant Professor, Computer Science Department
 University of Southern California, Los Angeles, CA 90089 USA
 ++







Re: to involve in your development group

2013-07-10 Thread sandeep rg
Sir,

My name is Sandeep RG. I am a B.Tech graduate in computer science, now doing
an internship at a company, working in Java.

I have installed everything successfully and am now downloading the
resources, which is taking a long time.

I have gone through the suggested OCR technologies.
JavaOCR has some good user reviews.
Apache Tika can process many different file formats.
Beyond that, there is Tesseract, which is also used for OCR.
Apache PDFBox is also used for text extraction, but only for PDF files.
I am now going through all of them to find the best technology among these.
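
As a rough illustration of how those options could divide the work, here is a minimal Java sketch that picks an extraction route from the file extension alone. The class name ExtractorChooser, the Kind values, and the extension lists are all hypothetical; no Tika, PDFBox, Tesseract, or JavaOCR call is made, and a real chooser would need to inspect the content rather than trust the file name.

```java
import java.util.Locale;

// Hypothetical sketch: the categories mirror the kinds of tools named above,
// but no library is called here and the extension lists are illustrative only.
final class ExtractorChooser {
    enum Kind { PLAIN_TEXT, PDF_TEXT_EXTRACTION, OCR, GENERAL_PARSER }

    // Decide which extraction route a file would take, based only on its extension.
    static Kind choose(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        switch (ext) {
            case "txt":
                return Kind.PLAIN_TEXT;            // already plain text, nothing to extract
            case "pdf":
                return Kind.PDF_TEXT_EXTRACTION;   // a PDF-only text extractor
            case "png": case "jpg": case "jpeg": case "tif": case "tiff":
                return Kind.OCR;                   // images need character recognition
            default:
                return Kind.GENERAL_PARSER;        // everything else goes to a multi-format parser
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"note.txt", "scan.TIFF", "report.pdf", "labs.xls"}) {
            System.out.println(name + " -> " + choose(name));
        }
    }
}
```

One limitation worth noting: a scanned PDF carries page images rather than a text layer, so the PDF route alone would likely return little text and OCR would still be needed. The sketch does not handle that case.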



RE: to involve in your development group

2013-07-09 Thread Chen, Pei
Hi Sandeep,
I am delighted to work with you on this project.

I wasn't sure I understood you correctly: did you mean that you have
already tried using cTAKES and its components?
If not, you can do an svn checkout of the code and try running the debugger GUI
from the command line (or from the Eclipse IDE). It lets you type in plain text
and get back the different structured content (types) that cTAKES produces:
export MAVEN_OPTS="-Xmx2g -Xms1g"
mvn -PrunCVD compile
From the guide:
https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+Developer+Install+Guide

A bit of background:
Apache cTAKES uses SVN for version control:
https://svn.apache.org/repos/asf/ctakes/trunk/
Jira for issue tracking:
https://issues.apache.org/jira/browse/ctakes
Maven for building and dependency management.
Many of the developers use the Eclipse IDE for development.
More info at ctakes.apache.org

cTAKES is built on top of the Apache UIMA framework.  Essentially, cTAKES is a
collection of Annotators (Java classes) wired together into a pipeline.
Its goal, in a nutshell, is to turn unstructured plain text into a
structured/normalized form; it is specially trained for medical notes.
Right now, cTAKES expects its input as plain text and does not have an OCR
component.
cTAKE-189:GSoC:implement OCR/tika to standardize text inputs was an idea to
allow cTAKES to take in any type of input (PDF, images, Word, XLS, etc.) and
pass the text on for cTAKES processing.
[I was originally thinking this could be done in some kind of preprocessing
step, or as an optional Annotator added at the beginning of a pipeline.]
There may be some existing work that could potentially be reused: Apache Tika
(https://issues.apache.org/jira/browse/TIKA-93) as well as some open source
OCR toolkits (JavaOCR).
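
The pipeline idea described above can be sketched in plain Java as follows. This is a toy model only: ToyDocument, ToyAnnotator, ToyPipeline, and the two example steps are hypothetical names and do not use the UIMA or cTAKES APIs. The first step stands in for the optional text-extraction stage at the beginning of the pipeline (here it merely normalizes whitespace), and the second stands in for a downstream annotator.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these are NOT the UIMA or cTAKES classes.
final class ToyDocument {
    String text;                                          // plain text the annotators read
    final List<String> annotations = new ArrayList<>();   // structured output added by annotators
    ToyDocument(String text) { this.text = text; }
}

// One processing step; each step reads the document and may add to it.
interface ToyAnnotator {
    void process(ToyDocument doc);
}

// Runs the registered steps in the order they were added.
final class ToyPipeline {
    private final List<ToyAnnotator> steps = new ArrayList<>();

    ToyPipeline add(ToyAnnotator step) { steps.add(step); return this; }

    ToyDocument run(String input) {
        ToyDocument doc = new ToyDocument(input);
        for (ToyAnnotator step : steps) {
            step.process(doc);
        }
        return doc;
    }
}

final class ToyPipelineDemo {
    // Hypothetical optional first step: a placeholder for OCR/Tika-style extraction.
    static final ToyAnnotator NORMALIZE_INPUT =
        doc -> doc.text = doc.text.replaceAll("\\s+", " ").trim();

    // Hypothetical downstream step: records one annotation per whitespace-separated token.
    static final ToyAnnotator TOKENIZE = doc -> {
        if (doc.text.isEmpty()) return;
        for (String token : doc.text.split(" ")) {
            doc.annotations.add("Token:" + token);
        }
    };

    static ToyPipeline build() {
        return new ToyPipeline().add(NORMALIZE_INPUT).add(TOKENIZE);
    }

    public static void main(String[] args) {
        ToyDocument doc = build().run("  chest   pain\n denied ");
        System.out.println(doc.text);
        System.out.println(doc.annotations);
    }
}
```

Because the extraction step is just another entry in the list, it can be left out when the input is already plain text, which is the appeal of the optional-Annotator approach.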

About Me:
http://childrenshospital.org/cfapps/research/data_admin/Site3240/mainpageS3240P8.html
http://www.linkedin.com/in/peistation
http://people.apache.org/committer-index.html#chenpei

 -Original Message-
 From: sandeep rg [mailto:sandeep.f...@gmail.com]
 Sent: Tuesday, July 09, 2013 1:19 PM
 To: dev@ctakes.apache.org
 Subject: Re: to involve in your development group
 
 Thanks a lot for your support. I would like to work with you.
 
 I have gone through the objectives of the software, used the software, and
 gone through the various components of the project. Can you give me a starting
 point for learning more about the coding side of the
 project?
 
 Can you tell me more about the project, and about yourself as well?
 
 
   
 


RE: to involve in your development group

2013-07-08 Thread Chen, Pei
Hi Sandeep,
Welcome!  I am not familiar with the details of icfoss-apache, but you
are more than welcome to work on the code, and contributions will be greatly
appreciated!
There may be a learning curve, but feel free to let us know if you have any
questions/issues.
Thanks,
Pei

 -Original Message-
 From: sandeep rg [mailto:sandeep.f...@gmail.com]
 Sent: Saturday, July 06, 2013 11:50 AM
 To: dev@ctakes.apache.org
 Subject: to involve in your development group
 
 My name is Sandeep. I am a B.Tech graduate. I participated in a camp
 held in Kerala, India, in association with icfoss-apache, called the youth
 mentoring programme and coordinated by Luciano Resende.
 
 I like the project and would like to be involved in it as a
 programmer. I have gone through your project and through the
 bugs list. I would like to work on the bug cTAKE-189:GSoC:implement OCR/tika to
 standardize text inputs for cTAKES. May I work on that?


Re: to involve in your development group

2013-07-08 Thread sandeep rg
Sir,

Details of the pilot mentoring programme with ICFOSS (India) are given
at the web address below:

http://community.apache.org/mentoringprogramme-icfoss-pilot.html


I am new to this community, so I need a mentor for the project. It would be
very helpful for me.





RE: to involve in your development group

2013-07-08 Thread Chen, Pei
Hi Sandeep,
Thank you for the interest.  I just had a quick look at the ICFOSS pilot 
mentoring program and will be happy to serve as a mentor for your project 
proposal(s) if you are interested.

--Pei
