Hello,
1) I have implemented a BSP job that uses an external library packaged as a
jar file. When I ran this job on my cluster, the log showed a
NoClassDefFoundError with a suggestion to use the BSPJob#setJar method.
So I packaged my BSP job and all required libraries into a single jar file,
added bsp.setJar("path_to_jar") to my code, and put the jar in the same
directory on every node, so the path is identical everywhere.
Now I get a similar error:
11/02/07 14:53:44 WARN bsp.GroomServer: Error running child
java.lang.NoClassDefFoundError: org/apache/commons/collections15/Factory
    at com.ibm.hama.algorithms.PageRank$PageRankBSP.setConf(PageRank.java:95)
    ...
(The Factory class is provided in the external jar.)
The exception is raised when I try to read my variable from the
configuration object:
public void setConf(Configuration conf) {
    this.conf = conf;
    storageDir = conf.get(STORAGE_DIR);  // <-- the exception is raised here
}
This looks really strange, because the error is about a missing definition
of the Factory class, yet this line of code does not use Factory. It seems
that Hama does not load the whole jar file on the nodes. So the question is:
how do I use external jars with Hama?
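In case it helps to narrow this down, here is a minimal sketch of a check that could be run both on a node and inside setConf. It uses only the standard Java library; the class name ClasspathCheck and its two methods are hypothetical helpers of mine, not part of Hama. It assumes the child task resolves classes through the thread context class loader, which I have not verified for Hama.

```java
import java.io.IOException;
import java.util.jar.JarFile;

public class ClasspathCheck {

    /** Returns true if the named class can be found by the current class loader. */
    public static boolean isClassVisible(String className) {
        // Assumption: the context class loader is the one the task uses.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClasspathCheck.class.getClassLoader();
        }
        try {
            // 'false' means: locate the class but do not run its static initializers.
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns true if the jar at jarPath contains a .class entry for the named class. */
    public static boolean jarContainsClass(String jarPath, String className)
            throws IOException {
        String entryName = className.replace('.', '/') + ".class";
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        String cls = "org.apache.commons.collections15.Factory";
        System.out.println("visible to class loader: " + isClassVisible(cls));
        if (args.length > 0) {
            // args[0] is the path to the combined jar on this node.
            System.out.println("present in " + args[0] + ": "
                    + jarContainsClass(args[0], cls));
        }
    }
}
```

If jarContainsClass reports true for the combined jar while isClassVisible reports false inside the task, that would suggest the jar itself is fine and is simply not on the task's classpath.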
2) I need to perform many supersteps on each Hama node. If I understand
correctly, each superstep corresponds to one call of the bsp() method. Is
that right? What is the lifecycle of a BSP job and its related process? I
want to share some results between supersteps; how should I do that? How
many times is a new object of the BSP class created on a Hama node: once per
superstep, or once per whole job?
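To make the second question concrete, here is a small plain-Java sketch of the two possibilities I have in mind. It deliberately uses no Hama classes: Task is a hypothetical stand-in for my BSP class, and the loop iterations stand in for supersteps. It only illustrates why the answer matters and says nothing about what Hama actually does.

```java
public class SuperstepStateSketch {

    /** Hypothetical stand-in for a BSP task; not a Hama class. */
    static class Task {
        // State I would like to carry over from one superstep to the next.
        private double partialResult = 0.0;

        /** One superstep of local work that updates the carried-over state. */
        void runSuperstep(double contribution) {
            partialResult += contribution;
        }

        double getPartialResult() {
            return partialResult;
        }
    }

    /** Case A: one object per job, so the field accumulates across supersteps. */
    public static double runWithOneInstance(double[] contributions) {
        Task task = new Task();
        for (double c : contributions) {
            task.runSuperstep(c);
        }
        return task.getPartialResult();
    }

    /** Case B: a new object per superstep, so earlier state is lost. */
    public static double runWithFreshInstances(double[] contributions) {
        double last = 0.0;
        for (double c : contributions) {
            Task task = new Task();
            task.runSuperstep(c);
            last = task.getPartialResult();
        }
        return last;
    }

    public static void main(String[] args) {
        double[] contributions = {1.0, 2.0, 3.0};
        System.out.println("one instance:    " + runWithOneInstance(contributions));
        System.out.println("fresh instances: " + runWithFreshInstances(contributions));
    }
}
```

In case A an instance field is enough to share results between supersteps; in case B the results would have to be stored somewhere outside the object.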
Cheers,
Pawel