This looks much more in-depth and helpful. I think your best next step, if you haven't already done so, is to connect with potential mentors and name them in your proposal.
-Greg

___________
Sent from my iPad. Apologies for any typos. A more detailed response may be sent later.

On Apr 4, 2012, at 10:31 AM, karthik prasad <karthikprasad...@gmail.com> wrote:

> Dear Sirs,
> I am grateful for your valuable feedback and suggestions.
>
> I have updated my proposal based on the inputs given by you. The split-up
> of the deliverables on the ideas page indeed helped me understand the
> requirements more clearly.
>
> The link to my updated proposal is
> https://www.mediawiki.org/wiki/User:Karthikprasad/gsoc2012proposal
>
> I request you and everyone to kindly skim through my proposal once again
> and suggest changes/additions.
> I am very excited about this project and working with you; and truth be
> told, 23rd April seems like ages ahead.
>
> Thanking you,
> Yours sincerely,
> Karthik
>
>
>> Date: Wed, 4 Apr 2012 11:49:41 +0200
>> From: "Oren Bochman" <orenboch...@gmail.com>
>> To: "'Wikimedia developers'" <wikitech-l@lists.wikimedia.org>
>> Subject: Re: [Wikitech-l] GSoC 2012: Proposal-Wikipedia Corpus Tools
>> Message-ID: <007f01cd1248$42ee6f40$c8cb4dc0$@com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> You do understand correctly!
>>
>> The main idea about NLP components is with POS tagger as an example:
>>
>> 1. a fallback system that does unsupervised POS tagging.
>> 2. the ability to plug in an existing POS tagger as these become
>> available for specific languages.
>>
>> As supervisor, I would recommend working with 3 languages:
>> English, Hebrew, and the GSoC student's native language.
>>
>> If we could get QA from other native speakers we would incorporate them
>> into the workflow.
>>
>> I think that by using a deletion/reversion based heuristic we may also be
>> able to make a spam corpus to boost the accuracy of the corpuses.
>>
>>
>> Operation Manager
>> E-mail: o...@romai-horizon.com
>> Mobil: +36 30 866 6706
>>
>>
>>
>> Római Horizon Kft.
>> H-1039 Budapest
>> Királyok útja 291. D. ép. fszt. 2.
>> Tel: +36 1 492 1492
>> Fax: +36 1 266 5529
>>
>> -----Original Message-----
>> From: wikitech-l-boun...@lists.wikimedia.org
>> [mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of Amir E. Aharoni
>> Sent: Tuesday, April 03, 2012 10:19 PM
>> To: Wikimedia developers
>> Subject: Re: [Wikitech-l] GSoC 2012: Proposal-Wikipedia Corpus Tools
>>
>> 2012/4/3 karthik prasad <karthikprasad...@gmail.com>:
>>> Hello,
>>> I am a GSoC aspirant and have compiled a proposal for one of the
>>> project ideas - Wikipedia Corpus Tools. [Mentor : Oren Bochman] I
>>> would sincerely appreciate if you could kindly go through it and
>>> suggest corrections/additions so that I can settle with a coherent
>>> proposal.
>>>
>>> Link to my proposal :
>>> https://www.mediawiki.org/wiki/User:Karthikprasad/gsoc2012proposal
>>
>> Nice, but why only English?
>>
>> If i understand the proposal correctly, this project is supposed to be
>> able to work with almost any language with very little effort.
>>
>> --
>> Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
>> http://aharoni.wordpress.com
>> “We're living in pieces, I want to live in peace.” - T. Moore
>>
>> _______________________________________________
>> Wikitech-l mailing list
>> Wikitech-l@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>>
>>
>> ------------------------------
>>
>>
>> Date: Wed, 4 Apr 2012 12:58:11 +0300
>> From: "Amir E. Aharoni" <amir.ahar...@mail.huji.ac.il>
>> To: Wikimedia developers <wikitech-l@lists.wikimedia.org>
>> Subject: Re: [Wikitech-l] GSoC 2012: Proposal-Wikipedia Corpus Tools
>> Message-ID:
>> <CACtNa8tS-PifzJS1JsF02k3qW_-7=uk-wdqnvsflglufhxn...@mail.gmail.com>
>> Content-Type: text/plain; charset=UTF-8
>>
>> 2012/4/4 Oren Bochman <orenboch...@gmail.com>:
>>> You do understand correctly!
>>>
>>> The main idea about NLP components is with POS tagger as an example:
>>
>> Just to make sure, POS = part of speech, isn't it?
>>
>> It's one of the most confusing TLAs in computing :)
>>
>>> If we could get QA from other native speakers we would incorporate them
>>> into the workflow.
>>
>> Good. As long as there is a way to plug other languages and a way for
>> speakers of other languages to contribute QA, i'm very happy.
>>
>> --
>> Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
>> http://aharoni.wordpress.com
>> “We're living in pieces,
>> I want to live in peace.” - T. Moore
>>
>
>
> Date: Wed, 4 Apr 2012 00:28:29 -0400
> From: Gregory Varnum <gregory.var...@gmail.com>
> To: Wikimedia developers <wikitech-l@lists.wikimedia.org>
> Subject: Re: [Wikitech-l] GSoC 2012: Proposal-Wikipedia Corpus Tools
> Message-ID: <ac4c429f-a839-4911-be9b-c8928aa2d...@gmail.com>
> Content-Type: text/plain; charset=utf-8
>
> Whoops - I meant that email to be directed to Karthik - although Amir
> you're welcome to read it as well. :)
>
> -greg
>
>
> On Apr 3, 2012, at 11:24 PM, Gregory Varnum <gregory.var...@gmail.com> wrote:
>
>> Amir,
>>
>> Thank you for your GSOC proposal! :)
>>
>> Between now and Google's submission deadline on April 6th - you are
>> invited to further modify your proposals. The GSOC page on MW.org -
>> https://www.mediawiki.org/wiki/GSOC - and our IRC rooms -
>> https://www.mediawiki.org/wiki/MediaWiki_on_IRC
>>
>> Looking over your proposal - I think you've got good background
>> information on yourself. However, I think you should flesh out more
>> details on the proposed project. Without more familiarity with corpus (and
>> with no links to find that info) - it's hard for everyone to weigh in
>> equally or to make sure your project gets the full consideration you'd like.
>>
>> -greg aka varnent
>>
>>
>> On Apr 3, 2012, at 4:18 PM, Amir E. Aharoni <amir.ahar...@mail.huji.ac.il> wrote:
>>
>>> 2012/4/3 karthik prasad <karthikprasad...@gmail.com>:
>>>> Hello,
>>>> I am a GSoC aspirant and have compiled a proposal for one of the project
>>>> ideas - Wikipedia Corpus Tools.
>>>> [Mentor : Oren Bochman]
>>>> I would sincerely appreciate if you could kindly go through it and suggest
>>>> corrections/additions so that I can settle with a coherent proposal.
>>>>
>>>> Link to my proposal :
>>>> https://www.mediawiki.org/wiki/User:Karthikprasad/gsoc2012proposal
>>>
>>> Nice, but why only English?
>>>
>>> If i understand the proposal correctly, this project is supposed to be
>>> able to work with almost any language with very little effort.
>>>
>>> --
>>> Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
>>> http://aharoni.wordpress.com
>>> “We're living in pieces,
>>> I want to live in peace.” - T. Moore