I changed the main methods to pass in the location of the JAR, since the Ant task puts the jar in basedir/dist. I made a comment about it on Mahout-3. The Canopy driver should now do the right thing. I also did the same thing with the k-means code.
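For reference, a minimal sketch of what passing the JAR location into main could look like, using Hadoop's old org.apache.hadoop.mapred.JobConf API. The driver class name and the argument layout here are assumptions for illustration, not the actual Mahout code:

```java
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

// Hypothetical driver shape, not the real Mahout class.
public class CanopyDriver {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf();
        // args[0]: path to the job JAR (e.g. under basedir/dist), passed
        // in by the caller instead of being hard-coded in the driver.
        conf.setJar(args[0]);
        // ... set input/output paths, mapper/reducer classes, etc. ...
        JobClient.runJob(conf);
    }
}
```

The alternative, when the driver class itself lives inside the job JAR, is conf.setJarByClass(CanopyDriver.class), which lets Hadoop locate the JAR on the classpath without an explicit path.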

I honestly don't think the JAR file needs to be specified as part of the JobConf. It is a hint, and one used only in very special cases (none of which I can think of, to be honest). To my understanding, the situation is this:

- When you assemble a job JAR, you should package all required dependencies under the {jarfile.jar}/lib folder.

- All of these classes are visible through the context class loader set up by Hadoop, so no special JAR tricks are required. When you submit a Hadoop job (remotely), you point to the JAR file with all of its dependencies and Hadoop takes it from there.
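A minimal, self-contained sketch (plain JDK, no Hadoop required) of the mechanism described above: resolving a class through the thread's context class loader, which is how classes bundled under the job JAR's lib/ folder become visible to task code at runtime. Here "java.util.ArrayList" merely stands in for a bundled dependency class:

```java
public class ContextLoaderDemo {
    public static void main(String[] args) throws Exception {
        // Hadoop sets a context class loader on task threads that can see
        // the job JAR and its lib/ contents; in this demo it is simply the
        // JVM's default application class loader.
        ClassLoader ctx = Thread.currentThread().getContextClassLoader();
        // Same lookup a task would perform for a lib/ dependency.
        Class<?> cls = Class.forName("java.util.ArrayList", true, ctx);
        System.out.println(cls.getName()); // prints "java.util.ArrayList"
    }
}
```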

- When you run the in-memory task tracker (for debugging, or locally), all the classes should be available on the normal classpath, and the context class loader should (again) resolve them successfully.

Can you enlighten me as to when pointing the JobConf at an explicit JAR file is required?

Dawid
