[ 
https://issues.apache.org/jira/browse/NUTCH-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609295#action_12609295
 ] 

Andrzej Bialecki  commented on NUTCH-634:
-----------------------------------------

This issue will likely be fixed in Hadoop 0.19. Until then, we can work around 
it in Nutch by overriding the Hadoop property hadoop.job.history.user.location 
and setting it to e.g. ${hadoop.log.dir}/history/user. IMHO using a special 
OutputFormat introduces more confusion and complicates future upgrades. 
Either way, this would have to be documented in the release notes.

I'd like to move forward on this issue in the next few days, if the solution I 
propose above seems acceptable: that is, remove the use of special 
OutputFormats and add an override for that Hadoop property in nutch-default.xml.
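
For illustration, such an override might look roughly like the fragment below. 
This is only a sketch, assuming the usual Hadoop <property> layout; the 
description text is an illustrative assumption and is not taken from any 
committed patch.

```xml
<!-- Sketch of the proposed override in nutch-default.xml.
     The value is the example given above; the description wording is assumed. -->
<property>
  <name>hadoop.job.history.user.location</name>
  <value>${hadoop.log.dir}/history/user</value>
  <description>Location for per-job history files. Overridden in Nutch as a
  workaround for NUTCH-634.</description>
</property>
```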

> Patch - Nutch - Hadoop 0.17.0
> -----------------------------
>
>                 Key: NUTCH-634
>                 URL: https://issues.apache.org/jira/browse/NUTCH-634
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 0.9.0
>            Reporter: Michael Gottesman
>            Assignee: Andrzej Bialecki 
>             Fix For: 0.9.0
>
>         Attachments: diff, hadoop-0.17.patch, hadoop-0.17.patch
>
>
> This is a patch so that Nutch can be used with Hadoop 0.17.0. The patch is 
> located at http://pastie.org/212001
> The patch compiles and passes all current Nutch unit tests.
> I have tested that the crawler side of Nutch (i.e. inject, generate, fetch, 
> parse, merge w/crawldb) definitely works, but have not tested the Lucene 
> indexing part. It might work, but it might not. 
> *NOTE* - the two main bugs that had to be overcome were not noticed by any of 
> the unit tests. The bugs only came up during actual testing. The bugs were:
> 1. Changes to the Hadoop Iterator
> 2. Addition of Serialization to MapReduce Framework

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
