Yes, the job doesn't even start; there's no map phase... it fails almost
instantly. I think I already tried setting the HADOOP_CLASSPATH variable,
but I'll try it again.

Thanks for your help,

Marco

On 15 September 2011 13:44, Joey Echeverria <j...@cloudera.com> wrote:
> Ok, but does the job even start the maps, or does it fail during initial 
> setup?
>
> The reason I ask is that -libjars only adds the jar to the classpath of
> the mappers and reducers. If you need the class before the job is
> submitted to the cluster, you should do something like this:
>
> HADOOP_CLASSPATH=../umd-hadoop-core/cloud9.jar hadoop jar myjob.jar
> myjob.driver.PreprocessANC -libjars ../umd-hadoop-core/cloud9.jar
> home/my/pyworkspace/openAnc.xml index/ 10 1
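>
> For -libjars to be picked up at all, the driver has to run its arguments
> through GenericOptionsParser, which is what ToolRunner does for you. A
> minimal sketch of what such a driver typically looks like (the job setup
> details are placeholders, not your actual code):
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.conf.Configured;
> import org.apache.hadoop.mapreduce.Job;
> import org.apache.hadoop.util.Tool;
> import org.apache.hadoop.util.ToolRunner;
>
> public class PreprocessANC extends Configured implements Tool {
>
>     @Override
>     public int run(String[] args) throws Exception {
>         // getConf() already reflects the generic options, so any jars
>         // listed with -libjars get shipped to the tasks at submit time.
>         Job job = new Job(getConf(), "preprocess-anc");
>         job.setJarByClass(PreprocessANC.class);
>         // ... input/output paths, mapper/reducer classes, etc. ...
>         return job.waitForCompletion(true) ? 0 : 1;
>     }
>
>     public static void main(String[] args) throws Exception {
>         // ToolRunner/GenericOptionsParser strip -libjars, -files and -D
>         // options before handing the remaining args to run().
>         int res = ToolRunner.run(new Configuration(), new PreprocessANC(), args);
>         System.exit(res);
>     }
> }
>
> Even with that in place, any class you touch inside run() is loaded by
> the local JVM, which is why the HADOOP_CLASSPATH above is still needed.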
>
> -Joey
>
> On Thu, Sep 15, 2011 at 4:24 AM, Marco Didonna <m.didonn...@gmail.com> wrote:
>> Right now I am still in standalone mode ... I'd like to fix this issue
>> before starting a cluster on EC2. :)
>>
>> Thanks for your time
>>
>> Marco
>>
>> On 14 September 2011 14:04, Joey Echeverria <j...@cloudera.com> wrote:
>>> When are you getting the exception? Is it during the setup of your
>>> job, or after it's running on the cluster?
>>>
>>> -Joey
>>>
>>> On Wed, Sep 14, 2011 at 4:50 AM, Marco Didonna <m.didonn...@gmail.com> 
>>> wrote:
>>>> Hello everyone,
>>>> sorry to bring this up again, but I need some clarification. I wrote a
>>>> map-reduce application that needs the Cloud9 library
>>>> (https://github.com/lintool/Cloud9). This library is packaged in a jar
>>>> file and I want to make it available to the whole cluster. So far I
>>>> have been working in standalone mode and I have unsuccessfully tried
>>>> to use the -libjars option: I always get a NoClassDefFoundError. The
>>>> only way I have made everything work is by copying cloud9.jar into the
>>>> hadoop/lib folder.
>>>> I suppose I cannot do that when using a cluster of N machines, since I
>>>> would have to copy it onto all N machines, and that approach isn't
>>>> feasible.
>>>>
>>>> Here's how I run the job: "hadoop jar myjob.jar
>>>> myjob.driver.PreprocessANC -libjars ../umd-hadoop-core/cloud9.jar
>>>> home/my/pyworkspace/openAnc.xml index/ 10 1"
>>>>
>>>> Is there some code that needs to be written in the driver in order to
>>>> have the darn library added to the "global" classpath? This -libjars
>>>> option is really poorly documented, IMHO.
>>>>
>>>> Any help would be very much appreciated ;)
>>>>
>>>> Marco Didonna
>>>>
>>>> On 17 August 2011 03:57, Anty <anty....@gmail.com> wrote:
>>>>> Thanks very much, Todd. I get it.
>>>>>
>>>>>
>>>>> On Wed, Aug 17, 2011 at 6:23 AM, Todd Lipcon <t...@cloudera.com> wrote:
>>>>>> Putting files on the classpath doesn't make them accessible to the
>>>>>> JVM's resource loader. If you have dir/foo.properties, then "dir" needs
>>>>>> to be on the classpath, not "dir/foo.properties". Since the working dir
>>>>>> of the task is on the classpath, -files works: it gets the properties
>>>>>> file into a directory that is on the classpath.
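>>>>>>
>>>>>> As a rough sketch of the task-side lookup (the file and class names
>>>>>> here are just examples, not your actual code):
>>>>>>
>>>>>> import java.io.InputStream;
>>>>>> import java.util.Properties;
>>>>>>
>>>>>> public class LoadConfig {
>>>>>>     public static Properties load() throws Exception {
>>>>>>         // The class loader resolves this name against classpath
>>>>>>         // *entries* (directories and jars). Putting dir/foo.properties
>>>>>>         // itself on the classpath doesn't help; putting dir on the
>>>>>>         // classpath, or letting -files drop the file into the task's
>>>>>>         // working directory, does.
>>>>>>         InputStream in = Thread.currentThread().getContextClassLoader()
>>>>>>                 .getResourceAsStream("foo.properties");
>>>>>>         Properties props = new Properties();
>>>>>>         props.load(in);
>>>>>>         in.close();
>>>>>>         return props;
>>>>>>     }
>>>>>> }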
>>>>>>
>>>>>> -Todd
>>>>>>
>>>>>> On Mon, Aug 15, 2011 at 8:09 PM, Anty <anty....@gmail.com> wrote:
>>>>>>> Thanks very much for your reply, Todd.
>>>>>>> I am at a complete loss. I want to ship a configuration file to the
>>>>>>> cluster to run my mapreduce job.
>>>>>>>
>>>>>>> If I use the -libjars option to ship the configuration file, the child
>>>>>>> JVM launched by the task tracker can't find the configuration file;
>>>>>>> curiously, the configuration file is already on the classpath of the
>>>>>>> child JVM.
>>>>>>>
>>>>>>> If I use the -files option to ship the configuration file, the child
>>>>>>> JVM can find the file.
>>>>>>> IMO, the difference between -libjars and -files is that -files creates
>>>>>>> a symbolic link to the configuration file in the current working
>>>>>>> directory of the child JVM.
>>>>>>>
>>>>>>> I dug into the source code, but it's so complicated that I can't
>>>>>>> figure out the root cause of this.
>>>>>>> So my question is:
>>>>>>> with the -libjars option the configuration file is already on the
>>>>>>> classpath, so why can't the class loader find the configuration file,
>>>>>>> yet the JVM class loader CAN find a jar shipped with -libjars?
>>>>>>>
>>>>>>> Any help will be appreciated.
>>>>>>>
>>>>>>> On Tue, Aug 16, 2011 at 1:06 AM, Todd Lipcon <t...@cloudera.com> wrote:
>>>>>>>> Your "driver" is the program that submits the job. The task is the
>>>>>>>> thing that runs on the cluster. They have separate classpaths.
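>>>>>>>>
>>>>>>>> A rough sketch of where that line falls (MyDriver and MyMapper are
>>>>>>>> made-up names, not your code):
>>>>>>>>
>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>> import org.apache.hadoop.conf.Configured;
>>>>>>>> import org.apache.hadoop.io.LongWritable;
>>>>>>>> import org.apache.hadoop.io.Text;
>>>>>>>> import org.apache.hadoop.mapreduce.Job;
>>>>>>>> import org.apache.hadoop.mapreduce.Mapper;
>>>>>>>> import org.apache.hadoop.util.Tool;
>>>>>>>> import org.apache.hadoop.util.ToolRunner;
>>>>>>>>
>>>>>>>> public class MyDriver extends Configured implements Tool {
>>>>>>>>
>>>>>>>>     // In practice this mapper would live in the jar you ship with
>>>>>>>>     // -libjars; it is inlined here only to keep the sketch compilable.
>>>>>>>>     public static class MyMapper
>>>>>>>>             extends Mapper<LongWritable, Text, Text, LongWritable> { }
>>>>>>>>
>>>>>>>>     @Override
>>>>>>>>     public int run(String[] args) throws Exception {
>>>>>>>>         // run() executes in the driver JVM, on the machine where you
>>>>>>>>         // typed "hadoop jar". Referencing MyMapper.class here means
>>>>>>>>         // the driver JVM must be able to load it (HADOOP_CLASSPATH);
>>>>>>>>         // -libjars only affects the task JVMs on the cluster, where
>>>>>>>>         // the mapper actually runs later.
>>>>>>>>         Job job = new Job(getConf(), "classpath-demo");
>>>>>>>>         job.setJarByClass(MyDriver.class);
>>>>>>>>         job.setMapperClass(MyMapper.class);
>>>>>>>>         return job.waitForCompletion(true) ? 0 : 1;
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     public static void main(String[] args) throws Exception {
>>>>>>>>         System.exit(ToolRunner.run(new Configuration(), new MyDriver(), args));
>>>>>>>>     }
>>>>>>>> }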
>>>>>>>>
>>>>>>>> Better to ask on the public lists if you want a more in-depth
>>>>>>>> explanation.
>>>>>>>>
>>>>>>>> -Todd
>>>>>>>>
>>>>>>>> On Mon, Aug 15, 2011 at 9:02 AM, Anty <anty....@gmail.com> wrote:
>>>>>>>>> Hi Todd,
>>>>>>>>> Would you please explain a little more?
>>>>>>>>>
>>>>>>>>> On Sat, Dec 11, 2010 at 2:08 AM, Todd Lipcon <t...@cloudera.com> 
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> You need to put the library jar on your classpath (eg using
>>>>>>>>>> HADOOP_CLASSPATH) as well. The -libjars will ship it to the cluster
>>>>>>>>>> and put it on the classpath of your task, but not the classpath of
>>>>>>>>>> your "driver" code.
>>>>>>>>>>
>>>>>>>>> I still don't understand what you mean by "but not the classpath of
>>>>>>>>> your "driver" code."
>>>>>>>>>
>>>>>>>>> Thanks in advance.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> -Todd
>>>>>>>>>>
>>>>>>>>>> On Thu, Dec 9, 2010 at 10:29 PM, Vipul Pandey <vipan...@gmail.com> 
>>>>>>>>>> wrote:
>>>>>>>>>> > Disclaimer: a newbie!!!
>>>>>>>>>> > Howdy?
>>>>>>>>>> > Got a quick question. The -libjars option doesn't seem to work for
>>>>>>>>>> > me in pretty much my first (or maybe second) mapreduce job.
>>>>>>>>>> > Here's what I'm doing:
>>>>>>>>>> > $bin/hadoop jar sherlock.jar somepkg.FindSchoolsJob -libjars
>>>>>>>>>> > HStats-1A18.jar input output
>>>>>>>>>> >
>>>>>>>>>> > sherlock.jar has my main class (of course), FindSchoolsJob, which
>>>>>>>>>> > runs just fine by itself until I add a dependency on a class in
>>>>>>>>>> > HStats-1A18.jar.
>>>>>>>>>> > When I run the above command with -libjars specified, it fails to
>>>>>>>>>> > find the classes that 'are' inside the HStats jar file.
>>>>>>>>>> > Exception in thread "main" java.lang.NoClassDefFoundError: com/*****/HAgent
>>>>>>>>>> > at com.*****.FindSchoolsJob.run(FindSchoolsJob.java:46)
>>>>>>>>>> > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>>>>>>>> > at com.******.FindSchoolsJob.main(FindSchoolsJob.java:101)
>>>>>>>>>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>>>> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>>>>> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>>>>> > at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>>>>> > at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>>>>>>>>> > Caused by: java.lang.ClassNotFoundException: com/*****/HAgent
>>>>>>>>>> > at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>>>>>>>>>> > at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>>> > at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>>>>>>>>>> > at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>>>>>>>>>> > at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>>>>>>>>>> > ... 8 more
>>>>>>>>>> >
>>>>>>>>>> > My main class is defined as below:
>>>>>>>>>> > public class FindSchoolsJob extends Configured implements Tool {
>>>>>>>>>> >     :
>>>>>>>>>> >     public int run(String[] args) throws Exception {
>>>>>>>>>> >         :
>>>>>>>>>> >         :
>>>>>>>>>> >     }
>>>>>>>>>> >     :
>>>>>>>>>> >     public static void main(String[] args) throws Exception {
>>>>>>>>>> >         int res = ToolRunner.run(new Configuration(), new FindSchoolsJob(), args);
>>>>>>>>>> >         System.exit(res);
>>>>>>>>>> >     }
>>>>>>>>>> > }
>>>>>>>>>> > Any hint would be highly appreciated.
>>>>>>>>>> > Thank You!
>>>>>>>>>> > ~V
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Todd Lipcon
>>>>>>>>>> Software Engineer, Cloudera
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best Regards
>>>>>>>>> Anty Rao
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Todd Lipcon
>>>>>>>> Software Engineer, Cloudera
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards
>>>>>>> Anty Rao
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Todd Lipcon
>>>>>> Software Engineer, Cloudera
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards
>>>>> Anty Rao
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Joseph Echeverria
>>> Cloudera, Inc.
>>> 443.305.9434
>>>
>>
>
>
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434
>
