[ https://issues.apache.org/jira/browse/MAHOUT-708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037462#comment-13037462 ]
Sean Owen commented on MAHOUT-708:
----------------------------------
Let's sit on this a while longer then. We should at least get onto 0.20.203.
Yes, it brings back joins and multiple outputs, which is 80% of the reason we'd
want it.
I'm on 0.22, and it lets me build a recommender pipeline that is half as
complex and about 5x as fast. It's big for some machine learning apps.
> Update to Hadoop 0.21
> ---------------------
>
> Key: MAHOUT-708
> URL: https://issues.apache.org/jira/browse/MAHOUT-708
> Project: Mahout
> Issue Type: Task
> Components: Classification, Clustering, Collaborative Filtering,
> Frequent Itemset/Association Rule Mining
> Affects Versions: 0.5
> Reporter: Sean Owen
> Assignee: Sean Owen
> Labels: hadoop
> Fix For: 0.6
>
>
> I suggest we should move to Hadoop 0.21 for the next release. It is the
> current release, soon to be superseded by 0.22. It matches more closely what
> CDH3/4 users use. It has bug fixes, and crucially some features that make
> joins much less painful.
> The drawback is that EMR does not yet support it. I still suggest we forge
> ahead as one imagines it will be supported by the time we release 0.6.