Hi,

My name is Mohit Bagde, and I am currently pursuing my Master's in CS at USC. I have taken CS572 (Information Retrieval and Search Engines) under Prof. Mattmann, where I worked on Nutch 1.X as part of the first assignment, which involved crawling with Nutch, integrating it with Tika, and subsequently developing a Nutch plugin. I am also taking INF 550 under Prof. Kim, where I am learning about HDFS and MapReduce. These two subjects meet in the JIRA issue NUTCH-1936, which is about porting Nutch to Hadoop 2.X.

I have two questions. At a very high level, what are the requirements for this project? And what kind of background is required? I would like to submit a project proposal, but I am not entirely sure what to put into it. I enjoyed working with Nutch and found the entire experience very educational. I would like to continue developing and contributing to Nutch in any way possible. I would be really obliged if you could give some more insight into this JIRA issue.

Sincerely,
Mohit Bagde

On Tue, Mar 10, 2015 at 9:54 PM, Ashwini Tokekar (JIRA) <j...@apache.org> wrote:

>
> [
> https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
> ]
>
> Ashwini Tokekar updated NUTCH-1936:
> -----------------------------------
> Comment: was deleted
>
> (was: Thanks Lewis)
>
> > GSoC 2015 - Move Nutch to Hadoop 2.X
> > ------------------------------------
> >
> > Key: NUTCH-1936
> > URL: https://issues.apache.org/jira/browse/NUTCH-1936
> > Project: Nutch
> > Issue Type: Task
> > Components: build
> > Reporter: Lewis John McGibbney
> > Labels: gsoc2015
> > Fix For: 2.4, 1.11
> >
> >
> > The Nutch PMC [discussed|
> > http://www.mail-archive.com/dev%40nutch.apache.org/msg16250.html] ideas
> > for a good 2015 GSoC project. It appears that porting the (trunk) codebase
> > to [Hadoop 2.X|http://hadoop.apache.org/docs/stable/] seems to an
> > attractive option and one which would present an excellent learning
> > experience for a summer student.
> > A more comprehensive description of this issue should be included within
> > either a mentor-defined project description or a successful student
> > application.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>

--
Mohit Bagde
Graduate Student, Computer Science,
University of Southern California,
Los Angeles, CA 90007.