Hello everyone, sorry to bring this up again, but I need some clarification. I wrote a map-reduce application that needs the Cloud9 library (https://github.com/lintool/Cloud9). This library is packaged in a jar file and I want to make it available to the whole cluster. So far I have been working in standalone mode and I have unsuccessfully tried to use the -libjars option: I always get a NoClassDefFoundError. The only way I made everything work was by copying cloud9.jar into the hadoop/lib folder. I assume I cannot do that when using a cluster of N machines, since I would have to copy it onto all N machines, and that approach isn't feasible.
Here's how I run the job:

hadoop jar myjob.jar myjob.driver.PreprocessANC -libjars ../umd-hadoop-core/cloud9.jar home/my/pyworkspace/openAnc.xml index/ 10 1

Is there some code that needs to be written in the driver in order to have the darn library added to the "global" classpath? This -libjars option is really poorly documented, IMHO. Any help would be very much appreciated ;)

Marco Didonna

On 17 August 2011 03:57, Anty <anty....@gmail.com> wrote:
> Thanks very much, Todd. I get it.
>
> On Wed, Aug 17, 2011 at 6:23 AM, Todd Lipcon <t...@cloudera.com> wrote:
>> Putting files on the classpath doesn't make them accessible to the JVM's
>> resource loader. If you have dir/foo.properties, then "dir" needs to
>> be on the classpath, not "dir/foo.properties". Since the working dir
>> of the task is on the classpath, -files works: it gets the properties
>> file into a directory that is on the classpath.
>>
>> -Todd
>>
>> On Mon, Aug 15, 2011 at 8:09 PM, Anty <anty....@gmail.com> wrote:
>>> Thanks very much for your reply, Todd.
>>> I am at a complete loss. I want to ship a configuration file to the
>>> cluster to run my mapreduce job.
>>>
>>> If I use the -libjars option to ship the configuration file, the
>>> child JVM launched by the task tracker can't find the configuration
>>> file; curiously, the configuration file is already on the classpath
>>> of the child JVM.
>>>
>>> If I use the -files option to ship the configuration file, the child
>>> JVM CAN find the file. IMO, the difference between -libjars and
>>> -files is that -files creates a symbolic link to the configuration
>>> file in the current working directory of the child JVM.
>>>
>>> I dug into the source code, but it's so complicated I can't figure
>>> out the root cause. So my question is: with the -libjars option the
>>> configuration file is already on the classpath, so why can't the
>>> classloader find the configuration file, yet the JVM classloader CAN
>>> find a jar shipped with -libjars?
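Todd's resource-loader point above can be reproduced with plain Java, outside Hadoop entirely. The sketch below (class and file names are made up for illustration) shows that a resource is only visible when the *directory* containing it is a classpath entry; putting the file's own path on the classpath does nothing, because a non-directory classpath entry is treated as a jar:

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

public class ClasspathDemo {
    public static void main(String[] args) throws IOException {
        // Hypothetical setup: a directory containing foo.properties.
        Path dir = Files.createTempDirectory("cpdemo");
        Path props = dir.resolve("foo.properties");
        Files.writeString(props, "key=value\n");

        // Putting the DIRECTORY on the classpath: the resource is found.
        try (URLClassLoader withDir =
                 new URLClassLoader(new URL[]{dir.toUri().toURL()}, null)) {
            System.out.println("dir on classpath: "
                + (withDir.getResource("foo.properties") != null));
        }

        // Putting the FILE ITSELF on the classpath: a non-directory entry
        // is treated as a jar, so the resource is NOT found.
        try (URLClassLoader withFile =
                 new URLClassLoader(new URL[]{props.toUri().toURL()}, null)) {
            System.out.println("file on classpath: "
                + (withFile.getResource("foo.properties") != null));
        }
    }
}
```

This mirrors the -files vs. -libjars behaviour Anty describes: -files puts the file into the task's working directory, and the working directory (a directory) is on the classpath, so the resource loader can see it.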
>>>
>>> Any help will be appreciated.
>>>
>>> On Tue, Aug 16, 2011 at 1:06 AM, Todd Lipcon <t...@cloudera.com> wrote:
>>>> Your "driver" is the program that submits the job. The task is the
>>>> thing that runs on the cluster. They have separate classpaths.
>>>>
>>>> Better to ask on the public lists if you want a more in-depth explanation.
>>>>
>>>> -Todd
>>>>
>>>> On Mon, Aug 15, 2011 at 9:02 AM, Anty <anty....@gmail.com> wrote:
>>>>> Hi Todd,
>>>>> Would you please explain a little more?
>>>>>
>>>>> On Sat, Dec 11, 2010 at 2:08 AM, Todd Lipcon <t...@cloudera.com> wrote:
>>>>>>
>>>>>> You need to put the library jar on your classpath (e.g. using
>>>>>> HADOOP_CLASSPATH) as well. The -libjars option will ship it to the
>>>>>> cluster and put it on the classpath of your task, but not the
>>>>>> classpath of your "driver" code.
>>>>>>
>>>>> I still can't understand what you mean by "but not the classpath of
>>>>> your "driver" code."
>>>>>
>>>>> Thanks in advance.
>>>>>
>>>>>> -Todd
>>>>>>
>>>>>> On Thu, Dec 9, 2010 at 10:29 PM, Vipul Pandey <vipan...@gmail.com> wrote:
>>>>>> > Disclaimer: a newbie!!!
>>>>>> > Howdy?
>>>>>> > Got a quick question. The -libjars option doesn't seem to work for
>>>>>> > me in pretty much my first (or maybe second) mapreduce job.
>>>>>> > Here's what I'm doing:
>>>>>> >
>>>>>> > $ bin/hadoop jar sherlock.jar somepkg.FindSchoolsJob -libjars HStats-1A18.jar input output
>>>>>> >
>>>>>> > sherlock.jar has my main class (of course) FindSchoolsJob, which
>>>>>> > runs just fine by itself until I add a dependency on a class in
>>>>>> > HStats-1A18.jar. When I run the above command with -libjars
>>>>>> > specified, it fails to find my classes that 'are' inside the
>>>>>> > HStats jar file.
>>>>>> >
>>>>>> > Exception in thread "main" java.lang.NoClassDefFoundError: com/*****/HAgent
>>>>>> >     at com.*****.FindSchoolsJob.run(FindSchoolsJob.java:46)
>>>>>> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>>>> >     at com.******.FindSchoolsJob.main(FindSchoolsJob.java:101)
>>>>>> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>> >     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>> >     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>>>>> > Caused by: java.lang.ClassNotFoundException: com/*****/HAgent
>>>>>> >     at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>>>>>> >     at java.security.AccessController.doPrivileged(Native Method)
>>>>>> >     at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>>>>>> >     at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>>>>>> >     at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>>>>>> >     ... 8 more
>>>>>> >
>>>>>> > My main class is defined as below:
>>>>>> >
>>>>>> > public class FindSchoolsJob extends Configured implements Tool {
>>>>>> >     :
>>>>>> >     public int run(String[] args) throws Exception {
>>>>>> >         :
>>>>>> >         :
>>>>>> >     }
>>>>>> >     :
>>>>>> >     public static void main(String[] args) throws Exception {
>>>>>> >         int res = ToolRunner.run(new Configuration(), new FindSchoolsJob(), args);
>>>>>> >         System.exit(res);
>>>>>> >     }
>>>>>> > }
>>>>>> >
>>>>>> > Any hint would be highly appreciated.
>>>>>> > Thank You!
>>>>>> > ~V
>>>>>>
>>>>>> --
>>>>>> Todd Lipcon
>>>>>> Software Engineer, Cloudera
>>>>>
>>>>> --
>>>>> Best Regards
>>>>> Anty Rao
>>>>
>>>> --
>>>> Todd Lipcon
>>>> Software Engineer, Cloudera
>>>
>>> --
>>> Best Regards
>>> Anty Rao
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>
> --
> Best Regards
> Anty Rao
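Pulling the thread's advice together into one sketch (not verified against a live cluster; the paths are Marco's from above): the jar has to be visible in two places. HADOOP_CLASSPATH covers the driver JVM on the submitting machine, and -libjars ships the jar to the task JVMs on the cluster. Note also that -libjars is a generic option parsed by GenericOptionsParser, so it only works when the driver goes through ToolRunner (as FindSchoolsJob above does) and must appear before the job's own arguments:

```shell
# Make the jar visible to the driver JVM on the submitting machine.
export HADOOP_CLASSPATH=../umd-hadoop-core/cloud9.jar

# -libjars ships the jar to the cluster and puts it on the TASK classpath;
# as a generic option it must precede the job's own arguments.
hadoop jar myjob.jar myjob.driver.PreprocessANC \
    -libjars ../umd-hadoop-core/cloud9.jar \
    home/my/pyworkspace/openAnc.xml index/ 10 1
```

This is why copying the jar into hadoop/lib "worked": that directory is on every Hadoop JVM's classpath, driver and task alike, but it doesn't scale to N machines.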