On 07/26/2012 09:20 PM, Steve Armstrong wrote:
> Do you mean I need to deploy the mahout jars to the lib directory of
> the master node? Or all the data nodes? Or is there a way to simply
> tell the hadoop job launcher to upload the jars itself?

Every node that runs a Task (mapper or reducer) needs access to your libraries.

There are ways to tell Hadoop to pull JARs from HDFS onto the task classpath (the DistributedCache / -libjars mechanism), yes. But I think most people find it simplest to build a "fat JAR" with something like Maven's shade plugin, so that the one JAR contains everything the mappers and reducers need. Then you only have to ship that single JAR around. I'd suggest starting with this simple approach just to get your stuff working, and then going back to investigate the distributed cache later.
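If it helps, here's roughly what the shade setup looks like in your pom.xml. Treat it as a minimal sketch: pick whatever plugin version is current, and the names further down are placeholders for your own.

  <build>
    <plugins>
      <!-- Bundle all compile/runtime dependencies (Mahout etc.)
           into one self-contained JAR at package time -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>1.7.1</version>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>

Then "mvn package" leaves a single self-contained JAR in target/ that you can run directly, something like:

  hadoop jar target/my-job-1.0.jar com.example.MyDriver <input> <output>

(my-job-1.0.jar and com.example.MyDriver are just example names.) If you later try the distributed-cache route instead, the usual entry point is the -libjars flag, which is only honored if your driver goes through ToolRunner/GenericOptionsParser.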

hth
