I use the "one big jar" technique for regular Hadoop and Mahout jobs
because of these kinds of problems. I use the jarjar task with Ant; I
expect Maven has something similar. I haven't had any class-not-found
problems since I started doing it.
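
For anyone who wants to try it, here is a minimal sketch of such an Ant target. It assumes the Jar Jar Links jar sits at lib/jarjar.jar, compiled classes are in build/classes, and the dependency jars to fold in are in a deps/ directory. All of those paths, and the output name myjob-all.jar, are hypothetical. The jarjar task extends Ant's standard jar task, so the usual nested filesets should apply.

```xml
<project name="onebigjar" default="bigjar">
  <!-- Assumed location of the Jar Jar Links jar; adjust to your layout. -->
  <taskdef name="jarjar"
           classname="com.tonicsystems.jarjar.JarJarTask"
           classpath="lib/jarjar.jar"/>

  <target name="bigjar">
    <!-- Hypothetical output name. -->
    <jarjar jarfile="build/myjob-all.jar">
      <!-- Your own compiled classes (hypothetical path). -->
      <fileset dir="build/classes"/>
      <!-- Unpack every dependency jar into the same flat jar,
           rather than nesting the jars under lib/. -->
      <zipgroupfileset dir="deps" includes="*.jar"/>
    </jarjar>
  </target>
</project>
```

The point of the flat layout is that every class ends up at the top level of one jar, so nothing depends on nested jars being unpacked and added to the classpath on the task side. You would normally leave Hadoop's own jars out of deps/, since the cluster already provides them.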
On 05/09/2011 10:32 AM, Benson Margulies wrote:
So that explains how some user rebundlings don't work with us, sometimes.
What it doesn't explain is why running the regular, not-rebundled
"mahout-examples-0.5-SNAPSHOT-job.jar" via the bin/mahout shell
script is throwing this ClassNotFoundException for me (and it's happened
to Sean, and according to the list archives, others as well) in a production
cluster.
I agree that it doesn't explain that. However, the code in Hadoop that
implements this mechanism, well, if you ask me ... it STINKS. It
wouldn't surprise me if it fails in some case we haven't
characterized. This would argue for Sean's 'one big jar' approach.