[ 
https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated NUTCH-1936:
----------------------------------------
    Attachment: NUTCH-1939.patch

Preliminary patch which folks can try out.
N.B. tests fail with an IOException regarding a failure to load specific 
mapred-site.xml properties.
I am not sure that all API upgrades are done 100% properly; however, this is an 
effort for us to upgrade to Hadoop 2.X.
I need to admit, I've pegged dependencies at 2.4.0 simply because this is what 
EMR uses... and right now we are using EMR for crawls. This is not bias on my 
part; it is merely my observation that both client and server should be using 
the same version. I understand that this is not adequate for everyone.
[~mjoyce]

> GSoC 2015 - Move Nutch to Hadoop 2.X
> ------------------------------------
>
>                 Key: NUTCH-1936
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1936
>             Project: Nutch
>          Issue Type: Task
>          Components: build
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>              Labels: gsoc2015
>             Fix For: 2.4, 1.11
>
>         Attachments: NUTCH-1939.patch
>
>
> The Nutch PMC 
> [discussed|http://www.mail-archive.com/dev%40nutch.apache.org/msg16250.html] 
> ideas for a good 2015 GSoC project. Porting the (trunk) 
> codebase to [Hadoop 2.X|http://hadoop.apache.org/docs/stable/] appears to be an 
> attractive option and one which would present an excellent learning 
> experience for a summer student.
> A more comprehensive description of this issue should be included within 
> either a mentor-defined project description or a successful student 
> application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
