Dear Prof. Mattmann,

Thanks. But the Fisher Callhome Corpus is a training corpus for machine translation between Spanish and English.
I don't think it can be adapted to sentiment analysis. Developing a generic training model/corpus for sentiment analysis that covers social media, movie reviews, etc. will be a challenging and exciting task!

Regards,
Harsha

On Fri, Mar 25, 2016 at 2:42 PM, Mattmann, Chris A (3980) <chris.a.mattm...@jpl.nasa.gov> wrote:

> Sounds great Harsha. This is for Google Summer of Code, so collaborating
> would be great, and in this case, we would be working with Madhawa, should
> he choose to accept.
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: chris.a.mattm...@nasa.gov
> WWW: http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Director, Information Retrieval and Data Science Group (IRDS)
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> WWW: http://irds.usc.edu/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> -----Original Message-----
> From: Harshavardhan Manjunatha <hmanj...@usc.edu>
> Date: Friday, March 25, 2016 at 2:38 PM
> To: jpluser <chris.a.mattm...@jpl.nasa.gov>
> Cc: "d...@opennlp.apache.org" <d...@opennlp.apache.org>, Information and
> Data Science Group USC List <ird...@mymaillists.usc.edu>,
> "kamal...@usc.edu" <kamal...@usc.edu>, "dev@tika.apache.org"
> <dev@tika.apache.org>
> Subject: Re: GSOC2016 Sentiment Analysis
>
> >Dear Prof. Mattmann,
> >
> >I would love to collaborate on this & am interested in developing
> >Sentiment Analysis Tika Parsers leveraging Apache OpenNLP.
> >
> >I have completed an Applied NLP course @ USC.
> >
> >I have done a Literature Review of Papers & Open Source Tools on the same
> >recently.
> >
> >Regards,
> >Harsha
> >
> >On Fri, Mar 25, 2016 at 2:07 PM, Mattmann, Chris A (3980)
> ><chris.a.mattm...@jpl.nasa.gov> wrote:
> >
> > > >Hi Madhawa,
> > > >
> > > >So, how about a project that develops and contributes an Apache
> > > >Tika and OpenNLP based SentimentAnalysisParser?
> > > >
> > > >I have some students currently doing work using the Fisher Callhome
> > > >Corpus and you can build off that. I am CC’ing my USC IRDS team
> > > >and my student Indhu who is working on this.
> > > >
> > > >Can you start working on your proposal by:
> > > >
> > > >1. Creating a JIRA issue here:
> > > >https://urldefense.proofpoint.com/v2/url?u=http-3A__issues.apache.org_jira_browse_TIKA&d=CwIGaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=8l56W6EU8xpHKOeTqpG03w&m=FEfICxmcDheHndXqky_rLNiYMcQE9yeOn7RoOwpR8t0&s=BPBK1ms1hzt9Tb5RdkU5B7FqRxuyMu3BoROpgd8Tvdw&e=
> > > >tag it with ‘gsoc2016’, ‘memex’, and ‘irds’ please
> > > >
> > > >2. Developing a proposal on the Tika wiki here:
> > > >https://urldefense.proofpoint.com/v2/url?u=http-3A__wiki.apache.org_tika_GSoC2016&d=CwIGaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=8l56W6EU8xpHKOeTqpG03w&m=FEfICxmcDheHndXqky_rLNiYMcQE9yeOn7RoOwpR8t0&s=GGQdxogPSoNhrlr5mALyeK4Jkn7og7u5K0Mr6qGuQ1s&e=
> > > >(you will need permission, first
> > > >sign up for your account on the wiki then tell me your username so I
> > > >can add permissions for you)
> > > >
> > > >3. Applying through the Google Summer of Code 2016 program.
> > > >
> > > >4. Getting in touch with me, and Indhu, and keeping dev@tika.a.o and
> > > >dev@openlp.a.o and ird...@usc.edu in the loop so we can discuss together
> > > >as a community.
> > > >
> > > >Cool?
> > > >
> > > >Cheers,
> > > >Chris
> > > >
> > > >-----Original Message-----
> > > >From: Madhawa Kasun Gunasekara <madhaw...@gmail.com>
> > > >Reply-To: "d...@opennlp.apache.org" <d...@opennlp.apache.org>
> > > >Date: Wednesday, March 16, 2016 at 10:51 PM
> > > >To: "d...@opennlp.apache.org" <d...@opennlp.apache.org>
> > > >Subject: GSOC2016 Sentiment Analysis
> > > >
> > > >>Hi
> > > >>
> > > >>I am interested in contributing to OPENNLP-840: "Sentiment Analysis" for
> > > >>GSOC2016 this time. Since I have been engaged with some similar projects,
> > > >>I think it will be a great experience for me.
> > > >>
> > > >>I am a final year student at IESL College of Engineering, Sri Lanka. I
> > > >>learned machine learning and natural language processing while
> > > >>doing my first degree (Computer Science) at the University of Sri
> > > >>Jayewardhenapura.
> > > >>
> > > >>During my internship, I actively contributed to a Twitter-based NLP
> > > >>project, and we published an article at an IEEE conference, "Real-time
> > > >>Natural Language Processing for Crowdsourced Road Traffic Alerts" [2].
> > > >>
> > > >>Please let me know what you think and what you suggest.
> > > >>
> > > >>Please kindly give me further information on how I could proceed. I
> > > >>was not able to find the mentioned paper, "Multi-Class Sentiment Analysis
> > > >>in Twitter: a Pattern-Based Approach".
> > > >>
> > > >>[1] https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_OPENNLP-2D840&d=CwIGaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=8l56W6EU8xpHKOeTqpG03w&m=FEfICxmcDheHndXqky_rLNiYMcQE9yeOn7RoOwpR8t0&s=p9CPiDKtrgF3BYZ8nLSWUXFncDjBBYV2ejUW4wPXtCY&e=
> > > >>[2] https://urldefense.proofpoint.com/v2/url?u=http-3A__ieeexplore.ieee.org_xpl_articleDetails.jsp-3Farnumber-3D7377667&d=CwIGaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=8l56W6EU8xpHKOeTqpG03w&m=FEfICxmcDheHndXqky_rLNiYMcQE9yeOn7RoOwpR8t0&s=V6EFcS7WaMwDxGZ5Ttm5-f-UTMLBlmIIBgkJYHB7P1w&e=
> > > >>
> > > >>Thanks
> > > >>Madhawa Gunasekara