1. Bundle the dependent jars inside your job jar, under a lib/ folder;
Hadoop adds anything under lib/ to the task classpath. This can make
your jar quite big; if you want to save time uploading big jar files
remotely, see 2.
2. Use -libjars with a full path, or a path relative to your jar
package; either should work.
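As a sketch of option 1, the packaging might look like this (the classes/ directory for your compiled job classes is an assumption, not something from the thread):

```shell
# Sketch: bundle dependencies under lib/ inside the job jar.
# Hadoop puts lib/*.jar from the job jar onto the task classpath.
mkdir -p lib
cp dependent-1.jar dependent-2.jar lib/
# classes/ holds your compiled .class files; lib/ is added from the
# current directory (-C only applies to the argument that follows it).
jar cvf main.jar -C classes . lib
```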
On 3
Hi Jane
Adding on to Joey's comments:
If you want to eliminate the process of distributing the dependent
jars every time, you can manually pre-distribute these jars across
the nodes and add them to the classpath of all nodes. This approach may
be chosen if you are periodically running jobs with the same dependencies.
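That pre-distribution approach could be sketched as follows (the host names and the /opt/hadoop path are placeholders, not from the thread):

```shell
# Sketch: copy the jars to every node once, then put them on the
# node-local classpath so jobs no longer need to ship them.
for host in node1 node2 node3; do          # placeholder host names
  scp dependent-1.jar dependent-2.jar "$host":/opt/hadoop/lib/ext/
done
# Then, in conf/hadoop-env.sh on each node (and restart the daemons):
# export HADOOP_CLASSPATH="/opt/hadoop/lib/ext/*:$HADOOP_CLASSPATH"
```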
On 06.03.2012 17:37, Jane Wayne wrote:
currently, i have my main jar and then 2 dependent jars. what i do is
1. copy dependent-1.jar to $HADOOP/lib
2. copy dependent-2.jar to $HADOOP/lib
then, when i need to run my job, MyJob inside main.jar, i do the following.
hadoop jar main.jar demo.MyJob
If you're using -libjars, there's no reason to copy the jars into
$HADOOP/lib. You may have to add the jars to the HADOOP_CLASSPATH if
you use them from your main() method:
export HADOOP_CLASSPATH=dependent-1.jar:dependent-2.jar
hadoop jar main.jar demo.MyJob -libjars
dependent-1.jar,dependent-2.jar
currently, i have my main jar and then 2 dependent jars. what i do is
1. copy dependent-1.jar to $HADOOP/lib
2. copy dependent-2.jar to $HADOOP/lib
then, when i need to run my job, MyJob inside main.jar, i do the following.
hadoop jar main.jar demo.MyJob -libjars dependent-1.jar,dependent-2.jar
-D
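One caveat worth noting here: -libjars and -D are generic options parsed by GenericOptionsParser, so they only take effect if demo.MyJob runs through ToolRunner. A complete invocation might look like this sketch (the -D property and the input/output paths are illustrative, not from the thread):

```shell
# Colon-separated, for the client JVM that runs main().
export HADOOP_CLASSPATH=dependent-1.jar:dependent-2.jar
hadoop jar main.jar demo.MyJob \
  -libjars dependent-1.jar,dependent-2.jar \
  -D mapred.reduce.tasks=2 \
  /user/jane/input /user/jane/output
```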