Right now I am still in standalone mode ... I'd like to fix this issue before starting a cluster on EC2. :)
Thanks for your time

Marco

On 14 September 2011 14:04, Joey Echeverria <j...@cloudera.com> wrote:
> When are you getting the exception? Is it during the setup of your
> job, or after it's running on the cluster?
>
> -Joey
>
> On Wed, Sep 14, 2011 at 4:50 AM, Marco Didonna <m.didonn...@gmail.com> wrote:
>> Hello everyone,
>> sorry to bring this up again but I need some clarification. I wrote a
>> map-reduce application that needs the cloud9 library
>> (https://github.com/lintool/Cloud9). This library is packaged in a jar
>> file and I want to make it available to the whole cluster. So far I
>> have been working in standalone mode and I have unsuccessfully tried
>> to use the -libjars option. I always get ClassNotDefException: the
>> only way I made everything work fine is by copying the cloud9.jar into
>> the hadoop/lib folder.
>> I suppose I cannot do that when using a cluster of N machines, since I
>> would have to copy it onto all N machines and this approach isn't
>> feasible.
>>
>> Here's how I run the job: "hadoop jar myjob.jar
>> myjob.driver.PreprocessANC -libjars ../umd-hadoop-core/cloud9.jar
>> home/my/pyworkspace/openAnc.xml index/ 10 1"
>>
>> Is there some code that needs to be written in the driver in order to
>> have the darn library added to the "global" classpath? This -libjars
>> option is really poorly documented IMHO.
>>
>> Any help would be very much appreciated ;)
>>
>> Marco Didonna
>>
>> On 17 August 2011 03:57, Anty <anty....@gmail.com> wrote:
>>> Thanks very much, Todd. I get it.
>>>
>>>
>>> On Wed, Aug 17, 2011 at 6:23 AM, Todd Lipcon <t...@cloudera.com> wrote:
>>>> Putting files on the classpath doesn't make them accessible to the JVM's
>>>> resource loader. If you have dir/foo.properties, then "dir" needs to
>>>> be on the classpath, not "dir/foo.properties". Since the working dir
>>>> of the task is on the classpath, -files works since it gets the
>>>> properties file into a directory on the classpath.
>>>>
>>>> -Todd
>>>>
>>>> On Mon, Aug 15, 2011 at 8:09 PM, Anty <anty....@gmail.com> wrote:
>>>>> Thanks very much for your reply, Todd.
>>>>> I am at a complete loss. I want to ship a configuration file to the
>>>>> cluster to run my mapreduce job.
>>>>>
>>>>> If I use the -libjars option to ship the configuration file, the
>>>>> child JVM launched by the task tracker
>>>>> can't find the configuration file. Curiously, the configuration file
>>>>> is already on the classpath of the child JVM.
>>>>>
>>>>> If I use the -files option to ship the configuration file, the child JVM
>>>>> can find the file.
>>>>> IMO, the difference between -libjars and -files is that -files
>>>>> will create a symbolic link to the configuration file
>>>>> in the current working directory of the child JVM.
>>>>>
>>>>> I dug into the source code, but it's so complicated that I can't figure out
>>>>> the root cause of this.
>>>>> So my question is:
>>>>> with the -libjars option, the configuration file is already on the
>>>>> classpath, so why can't the classloader find the configuration file,
>>>>> while the JVM classloader CAN find a jar shipped with the -libjars option?
>>>>>
>>>>> Any help will be appreciated.
>>>>>
>>>>> On Tue, Aug 16, 2011 at 1:06 AM, Todd Lipcon <t...@cloudera.com> wrote:
>>>>>> Your "driver" is the program that submits the job. The task is the
>>>>>> thing that runs on the cluster. They have separate classpaths.
>>>>>>
>>>>>> Better to ask on the public lists if you want a more in-depth explanation
>>>>>>
>>>>>> -Todd
>>>>>>
>>>>>> On Mon, Aug 15, 2011 at 9:02 AM, Anty <anty....@gmail.com> wrote:
>>>>>>> Hi Todd,
>>>>>>> Would you please explain a little more?
>>>>>>>
>>>>>>> On Sat, Dec 11, 2010 at 2:08 AM, Todd Lipcon <t...@cloudera.com> wrote:
>>>>>>>>
>>>>>>>> You need to put the library jar on your classpath (eg using
>>>>>>>> HADOOP_CLASSPATH) as well.
>>>>>>>> The -libjars will ship it to the cluster
>>>>>>>> and put it on the classpath of your task, but not the classpath of
>>>>>>>> your "driver" code.
>>>>>>>>
>>>>>>> I still can't understand what you mean by "but not the classpath of
>>>>>>> your "driver" code."
>>>>>>>
>>>>>>> Thanks in advance.
>>>>>>>
>>>>>>>
>>>>>>>> -Todd
>>>>>>>>
>>>>>>>> On Thu, Dec 9, 2010 at 10:29 PM, Vipul Pandey <vipan...@gmail.com>
>>>>>>>> wrote:
>>>>>>>> > disclaimer : a newbie!!!
>>>>>>>> > Howdy?
>>>>>>>> > Got a quick question. The -libjars option doesn't seem to work for me in
>>>>>>>> > pretty much my first (or maybe second) mapreduce job.
>>>>>>>> > Here's what I'm doing:
>>>>>>>> > $bin/hadoop jar sherlock.jar somepkg.FindSchoolsJob -libjars
>>>>>>>> > HStats-1A18.jar input output
>>>>>>>> >
>>>>>>>> > sherlock.jar has my main class (of course) FindSchoolsJob, which runs
>>>>>>>> > just fine by itself till I add a dependency on a class in HStats-1A18.jar.
>>>>>>>> > When I run the above command with -libjars specified, it fails to
>>>>>>>> > find my classes that 'are' inside the HStats jar file.
>>>>>>>> > Exception in thread "main" java.lang.NoClassDefFoundError: com/*****/HAgent
>>>>>>>> >   at com.*****.FindSchoolsJob.run(FindSchoolsJob.java:46)
>>>>>>>> >   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>>>>>> >   at com.******.FindSchoolsJob.main(FindSchoolsJob.java:101)
>>>>>>>> >   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>> >   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>>> >   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>>> >   at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>>> >   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>>>>>>> > Caused by: java.lang.ClassNotFoundException:com/*****/HAgent
>>>>>>>> >   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>>>>>>>> >   at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>> >   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>>>>>>>> >   at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>>>>>>>> >   at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>>>>>>>> >   ... 8 more
>>>>>>>> >
>>>>>>>> > My main class is defined as below :
>>>>>>>> > public class FindSchoolsJob extends Configured implements Tool {
>>>>>>>> >   :
>>>>>>>> >   public int run(String[] args) throws Exception {
>>>>>>>> >     :
>>>>>>>> >     :
>>>>>>>> >   }
>>>>>>>> >   :
>>>>>>>> >   public static void main(String[] args) throws Exception {
>>>>>>>> >     int res = ToolRunner.run(new Configuration(), new FindSchoolsJob(), args);
>>>>>>>> >     System.exit(res);
>>>>>>>> >   }
>>>>>>>> > }
>>>>>>>> > Any hint would be highly appreciated.
>>>>>>>> > Thank You!
>>>>>>>> > ~V
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Todd Lipcon
>>>>>>>> Software Engineer, Cloudera
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards
>>>>>>> Anty Rao
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Todd Lipcon
>>>>>> Software Engineer, Cloudera
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards
>>>>> Anty Rao
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Todd Lipcon
>>>> Software Engineer, Cloudera
>>>>
>>>
>>>
>>>
>>> --
>>> Best Regards
>>> Anty Rao
>>>
>>
>
>
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434
>
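
Todd's point in the thread, that "dir" must be on the classpath rather than "dir/foo.properties", can be checked without Hadoop. The sketch below is only an illustration using the plain JDK: the class name ResourceLookupDemo, the helper canFind, and the temporary file are all made up for this example, and it does not reproduce what -libjars or -files do internally. It builds a URLClassLoader from a single classpath entry and asks it for the resource by name.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ResourceLookupDemo {

    /**
     * Builds a class loader whose only classpath entry is the given path
     * and reports whether it can locate the named resource.
     * (Hypothetical helper, written for this illustration only.)
     */
    public static boolean canFind(Path classpathEntry, String resourceName) throws IOException {
        URL[] urls = { classpathEntry.toUri().toURL() };
        // Parent is null so that only our single entry is searched.
        try (URLClassLoader loader = new URLClassLoader(urls, null)) {
            return loader.findResource(resourceName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Create dir/foo.properties in a temporary location.
        Path dir = Files.createTempDirectory("conf");
        Path file = Files.write(dir.resolve("foo.properties"),
                "key=value\n".getBytes(StandardCharsets.UTF_8));

        // The directory as classpath entry: the file is visible as a resource.
        System.out.println("directory on classpath: " + canFind(dir, "foo.properties"));

        // The file itself as classpath entry: the loader treats a non-directory
        // entry as a jar archive, so the properties file is not found.
        System.out.println("file on classpath:      " + canFind(file, "foo.properties"));
    }
}
```

This matches the behaviour described above: a jar is a valid classpath entry, so classes inside a shipped jar load fine, while a bare properties file listed as an entry is not something the resource loader can look inside. According to Todd's reply, -files works because it places the file in the task's working directory, which is itself a directory on the classpath.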