Hi Chris,

Thanks for the reply. I tried to log in to [1], but I wasn't able to. My username is "Madhawa Gunasekara".

[1] https://wiki.apache.org/tika/GSoC2016

I have created a JIRA issue at https://issues.apache.org/jira/browse/TIKA-1911

Thanks,
Madhawa

On Sat, Mar 26, 2016 at 3:21 AM, Mattmann, Chris A (3980) <chris.a.mattm...@jpl.nasa.gov> wrote:

> Thanks Harsha. Yes, I know about the Fisher Callhome Corpus. There
> is related data in there that can be used for sentiment analysis :)
> It can be adapted and is being used for that.
>
> Anyways, yes looking forward to the task. Please send in your proposal
> Madhawa.
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: chris.a.mattm...@nasa.gov
> WWW: http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Director, Information Retrieval and Data Science Group (IRDS)
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> WWW: http://irds.usc.edu/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> -----Original Message-----
> From: Harshavardhan Manjunatha <hmanj...@usc.edu>
> Date: Friday, March 25, 2016 at 2:45 PM
> To: jpluser <chris.a.mattm...@jpl.nasa.gov>
> Cc: "dev@opennlp.apache.org" <dev@opennlp.apache.org>, Information and
> Data Science Group USC List <ird...@mymaillists.usc.edu>,
> "kamal...@usc.edu" <kamal...@usc.edu>, "d...@tika.apache.org"
> <d...@tika.apache.org>
> Subject: Re: GSOC2016 Sentiment Analysis
>
> >Dear Prof. Mattmann,
> >
> >Thanks. But the Fisher Callhome Corpus is a training corpus for machine
> >translation between Spanish and English.
> >
> >I don't think it can be adapted to Sentiment Analysis.
> >
> >Developing a generic training model/corpus for Sentiment Analysis that
> >encapsulates social media, movie reviews, etc. will be a challenging
> >and exciting task!
> >
> >Regards,
> >Harsha
> >
> >On Fri, Mar 25, 2016 at 2:42 PM, Mattmann, Chris A (3980)
> ><chris.a.mattm...@jpl.nasa.gov> wrote:
> >
> >Sounds great Harsha. This is for Google Summer of Code, so collaborating
> >would be great, and in this case, we would be working with Madhawa, should
> >he choose to accept.
> >
> >-----Original Message-----
> >From: Harshavardhan Manjunatha <hmanj...@usc.edu>
> >Date: Friday, March 25, 2016 at 2:38 PM
> >To: jpluser <chris.a.mattm...@jpl.nasa.gov>
> >Cc: "dev@opennlp.apache.org" <dev@opennlp.apache.org>, Information and
> >Data Science Group USC List <ird...@mymaillists.usc.edu>,
> >"kamal...@usc.edu" <kamal...@usc.edu>, "d...@tika.apache.org"
> ><d...@tika.apache.org>
> >Subject: Re: GSOC2016 Sentiment Analysis
> >
> >>Dear Prof. Mattmann,
> >>
> >>I would love to collaborate on this & am interested in developing
> >>Sentiment Analysis Tika Parsers leveraging Apache OpenNLP.
> >>
> >>I have completed an Applied NLP course @ USC.
> >>
> >>I have done a literature review of papers & open source tools on the same
> >>recently.
> >>
> >>Regards,
> >>Harsha
> >>
> >>On Fri, Mar 25, 2016 at 2:07 PM, Mattmann, Chris A (3980)
> >><chris.a.mattm...@jpl.nasa.gov> wrote:
> >>
> >>Hi Madhawa,
> >>
> >>So, how about a project that develops and contributes an Apache
> >>Tika and OpenNLP based SentimentAnalysisParser?
> >>
> >>I have some students currently doing work using the Fisher Callhome
> >>Corpus and you can build off that. I am CC’ing my USC IRDS team
> >>and my student Indhu who is working on this.
> >>
> >>Can you start working on your proposal by:
> >>
> >>1. Creating a JIRA issue here:
> >>   https://urldefense.proofpoint.com/v2/url?u=http-3A__issues.apache.org_jira_browse_TIKA&d=CwIGaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=8l56W6EU8xpHKOeTqpG03w&m=FEfICxmcDheHndXqky_rLNiYMcQE9yeOn7RoOwpR8t0&s=BPBK1ms1hzt9Tb5RdkU5B7FqRxuyMu3BoROpgd8Tvdw&e=
> >>   and tagging it with ‘gsoc2016’, ‘memex’, and ‘irds’ please
> >>
> >>2. Developing a proposal on the Tika wiki here:
> >>   https://urldefense.proofpoint.com/v2/url?u=http-3A__wiki.apache.org_tika_GSoC2016&d=CwIGaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=8l56W6EU8xpHKOeTqpG03w&m=FEfICxmcDheHndXqky_rLNiYMcQE9yeOn7RoOwpR8t0&s=GGQdxogPSoNhrlr5mALyeK4Jkn7og7u5K0Mr6qGuQ1s&e=
> >>   (you will need permission: first sign up for your account on the
> >>   wiki, then tell me your username so I can add permissions for you)
> >>
> >>3. Applying through the Google Summer of Code 2016 program.
> >>
> >>4. Getting in touch with me and Indhu, and keeping dev@tika.a.o,
> >>   dev@opennlp.a.o and ird...@usc.edu in the loop so we can discuss
> >>   together as a community.
> >>
> >>Cool?
> >>
> >>Cheers,
> >>Chris
> >>
> >>-----Original Message-----
> >>From: Madhawa Kasun Gunasekara <madhaw...@gmail.com>
> >>Reply-To: "dev@opennlp.apache.org" <dev@opennlp.apache.org>
> >>Date: Wednesday, March 16, 2016 at 10:51 PM
> >>To: "dev@opennlp.apache.org" <dev@opennlp.apache.org>
> >>Subject: GSOC2016 Sentiment Analysis
> >>
> >>>Hi
> >>>
> >>>I am interested in contributing to OPENNLP-840: "Sentiment Analysis" for
> >>>GSOC2016. Since I have been engaged with some similar projects, I
> >>>think it will be a great experience for me.
> >>>
> >>>I am a final year student at IESL College of Engineering, Sri Lanka. I
> >>>learned machine learning and natural language processing while
> >>>doing my first degree (Computer Science) at the University of Sri
> >>>Jayewardhenapura.
> >>>
> >>>During my internship, I actively contributed to a Twitter based NLP
> >>>project.
> >>>We have published an article at an IEEE conference, "Real-time
> >>>Natural Language Processing for Crowdsourced Road Traffic Alerts" [2].
> >>>
> >>>Please let me know what you think and what you suggest.
> >>>
> >>>Please kindly give me further information on how I could proceed. I
> >>>wasn't able to find the mentioned paper "Multi-Class Sentiment Analysis
> >>>in Twitter: a Pattern-Based Approach".
> >>>
> >>>[1] https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_OPENNLP-2D840&d=CwIGaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=8l56W6EU8xpHKOeTqpG03w&m=FEfICxmcDheHndXqky_rLNiYMcQE9yeOn7RoOwpR8t0&s=p9CPiDKtrgF3BYZ8nLSWUXFncDjBBYV2ejUW4wPXtCY&e=
> >>>[2] https://urldefense.proofpoint.com/v2/url?u=http-3A__ieeexplore.ieee.org_xpl_articleDetails.jsp-3Farnumber-3D7377667&d=CwIGaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=8l56W6EU8xpHKOeTqpG03w&m=FEfICxmcDheHndXqky_rLNiYMcQE9yeOn7RoOwpR8t0&s=V6EFcS7WaMwDxGZ5Ttm5-f-UTMLBlmIIBgkJYHB7P1w&e=
> >>>
> >>>Thanks
> >>>Madhawa Gunasekara