[
https://issues.apache.org/jira/browse/MAHOUT-708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen resolved MAHOUT-708.
------------------------------
Resolution: Fixed
Updating to 0.20.203.0 was almost painless. There were two problems.
First, Hadoop 0.20.203.0 depends on Jackson from Codehaus but doesn't declare
it in its POM, so we had to add that dependency manually.
Second, it writes _SUCCESS files in output dirs. Many bits of code and tests
still didn't correctly filter these out. This was easy to fix. Incidentally,
this ought to fix some problems people see on CDH, which has the same behavior.
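
For illustration only, here is a minimal sketch of the kind of filtering
involved. It is not the code that was committed: the class and method names
(OutputFileFilter, isDataFile, dataFiles) are hypothetical, and it works on
plain file names rather than Hadoop Path objects so that it runs without
Hadoop on the classpath. It assumes the usual convention that names starting
with "_" or "." are markers or hidden files rather than job output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not Mahout's actual class.
final class OutputFileFilter {

  private OutputFileFilter() {
  }

  // Assumption: names starting with "_" (e.g. _SUCCESS, _logs) or "."
  // (e.g. checksum files) are not data and should be skipped.
  static boolean isDataFile(String fileName) {
    return !fileName.isEmpty()
        && !fileName.startsWith("_")
        && !fileName.startsWith(".");
  }

  // Returns only the names that look like real output files, in order.
  static List<String> dataFiles(List<String> fileNames) {
    List<String> result = new ArrayList<String>();
    for (String name : fileNames) {
      if (isDataFile(name)) {
        result.add(name);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> listing = Arrays.asList(
        "part-00000", "_SUCCESS", "_logs", ".part-00000.crc", "part-00001");
    // Prints only the part files.
    System.out.println(dataFiles(listing));
  }
}
```

In real job code the same predicate would more likely sit behind a Hadoop
PathFilter passed to the file listing call, so that every reader of an output
dir skips _SUCCESS the same way.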
I should stress that the resulting code is still entirely compatible with
Hadoop 0.20.2. We haven't really upped requirements.
> Update to Hadoop 0.20.203.0
> ---------------------------
>
> Key: MAHOUT-708
> URL: https://issues.apache.org/jira/browse/MAHOUT-708
> Project: Mahout
> Issue Type: Task
> Components: Classification, Clustering, Collaborative Filtering,
> Frequent Itemset/Association Rule Mining
> Affects Versions: 0.5
> Reporter: Sean Owen
> Assignee: Sean Owen
> Labels: hadoop
> Fix For: 0.6
>
>
> I suggest we should move to Hadoop 0.20.203.0 for the next release. (Not 0.21
> or further.) It is a much more recent branch of 0.20.x and is compile-time
> compatible with 0.20.2 in our code already.
> However, I already know that switching to it causes some failures, in the
> Lanczos jobs for instance. Looks like something's expecting a file somewhere
> that isn't where it used to be. I bet it's an easy fix, but don't know what
> it is yet.