I apologize, Sean; I wasn't aware of the complete history of this thread and didn't know Hadoop 2.x was involved here. If so, then yes, you need to build Mahout from HEAD with the Hadoop 2 profile to get it working.
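For reference, here is a rough sketch of what that build invocation might look like, written as a small Python helper that only assembles and prints the Maven command. The `-Dhadoop2.version` property name and the `2.2.0` value are assumptions on my part, not something confirmed in this thread; check the profiles section of the `pom.xml` in your checkout for the property that actually activates the Hadoop 2 profile. `mahout_build_command` is a hypothetical helper name.

```python
import shlex


def mahout_build_command(hadoop_version="2.2.0", skip_tests=True):
    """Return the Maven argv for building a Mahout checkout against Hadoop 2.x.

    ASSUMPTION: the Hadoop 2 profile is activated by the `hadoop2.version`
    property. Verify this against the pom.xml in your checkout.
    """
    cmd = ["mvn", "clean", "install", f"-Dhadoop2.version={hadoop_version}"]
    if skip_tests:
        # -DskipTests is the standard Maven Surefire switch for skipping test runs.
        cmd.append("-DskipTests")
    return cmd


# Dry run: print the command instead of executing it.
print(shlex.join(mahout_build_command()))
```

To actually run the build, the returned list could be passed to `subprocess.run(cmd, cwd=<path to the Mahout checkout>, check=True)`.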

On Wednesday, March 5, 2014 12:04 PM, Sean Owen <sro...@gmail.com> wrote:

CDH 4.5 and 4.6 are both 0.7 + patches. Neither contains 0.8, since it has (tiny) breaking changes vs 0.7 and this is a minor version update. CDH5 contains 0.8 + patches. I did not say CDH4 has 0.8 -- re-read the message of mine that was quoted.

http://archive.cloudera.com/cdh4/cdh/4/mahout-0.7-cdh4.5.0.CHANGES.txt
http://archive.cloudera.com/cdh4/cdh/4/mahout-0.7-cdh4.6.0.CHANGES.txt

Those two patches are not in CDH 4.x, no. The non-upstream changes are basically all internal packaging stuff, and that can include modifying dependency versions to harmonize with the rest of the platform. That's the sense in which it works with Hadoop 2.

I don't think the change you cite is sufficient to work with Hadoop 2. You also, for example, must build against the Hadoop 2 profile in Mahout in Maven. For that you do not need the CDH repo even, just point to the Hadoop 2.x release if you like. I know there has been a patch in even just the past few weeks that makes it work even better with 2.x. So I suppose I would build from HEAD if possible to take advantage.

On Wed, Mar 5, 2014 at 4:30 PM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:
> Not sure if the CDH4 patches on top of 0.7 has fixes for M-1067 and M-1098
> which address the issues u r seeing.
>
> The second part of the issue u r seeing with Mahout 0.9 distro seems to be
> related to how u set it up on CDH4. I apologize for not being helpful here as
> I am not a CDH4 user or expert.
>
> Sean?