Hi!

I am using Hadoop on Eclipse and am able to run jobs. My question is: how can I run an application with the Capacity Scheduler? Also, I am unable to see the JobTracker in the web GUI at the default JobTracker address. I have not edited any *-site.xml files.

Thanks,
Arun
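On the second question, a sketch of a quick check, assuming a stock 0.20.x pseudo-distributed setup where the daemons have actually been started: the JobTracker web UI address comes from mapred.job.tracker.http.address, which defaults to 0.0.0.0:50030 in mapred-default.xml, so with untouched *-site.xml files the UI should be at:

  $ jps                                          # JobTracker must appear in this list
  $ curl http://localhost:50030/jobtracker.jsp   # default JobTracker web UI page

If jps does not list a JobTracker process, the UI being unreachable is expected. Note also that with no *-site.xml edits, mapred.job.tracker defaults to "local", so jobs launched purely inside Eclipse may run with the local job runner, in which case there is no JobTracker UI at all.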
On Fri, Sep 16, 2011 at 3:56 PM, Swathi V <swat...@zinniasystems.com> wrote:
> It might be in safe mode. Turn safe mode off with
> *bin/hadoop dfsadmin -safemode leave*
> to see the JobTracker.
>
>
> On Fri, Sep 16, 2011 at 2:09 PM, arun k <arunk...@gmail.com> wrote:
>>
>> Hi!
>>
>> I have set up hadoop-0.20.2 on Eclipse Helios and am able to run the example
>> wordcount through the ExampleDriver class, as described by Faraz in
>> http://lucene.472066.n3.nabble.com/HELP-configuring-hadoop-on-ECLIPSE-td1086829.html#a2241534
>>
>> Two questions:
>> 1. I am unable to see the JobTracker and the other daemons in the browser at
>> the HTTP address given in mapred-default.xml. I have not edited any *-site.xml
>> files. I tried editing the *-site.xml files as per Michael Noll's site, but
>> that didn't help.
>>
>> 2. Capacity Scheduler: I see the capacity-scheduler-*.jar in the lib folder. I
>> have modified mapred-site.xml and capacity-scheduler.xml as required. How do I
>> run application jobs by submitting a job to a queue in this case?
>> I tried to run with program & args:
>> wordcount -Dmapred.job.queue.name=myqueue1 input_file_loc output_file_loc
>> But I get the error:
>> Exception in thread "main" java.lang.Error: Unresolved compilation problems:
>> ProgramDriver cannot be resolved to a type
>> ProgramDriver cannot be resolved to a type
>> DistributedPentomino cannot be resolved to a type
>> .................
>>
>> Thanks,
>> Arun
>>
>>
>> On Fri, Sep 16, 2011 at 12:46 PM, Harsh J <ha...@cloudera.com> wrote:
>>>
>>> Arun,
>>>
>>> Good to know. Happy Hadoopin'!
>>>
>>> On Fri, Sep 16, 2011 at 12:34 PM, arun k <arunk...@gmail.com> wrote:
>>> > Hi!
>>> > Thanks, Harsh!
>>> > The problem was that I had set up the queue info in mapred-site.xml
>>> > instead of capacity-scheduler.xml.
>>> > Arun
>>> >
>>> > On Fri, Sep 16, 2011 at 10:52 AM, Harsh J <ha...@cloudera.com> wrote:
>>> >>
>>> >> Arun,
>>> >>
>>> >> Please do not cross-post to multiple lists. Let's continue this on
>>> >> mapreduce-user@ alone.
>>> >>
>>> >> Your problem isn't the job submission here, but your Capacity
>>> >> Scheduler configuration. For every queue you configure, you need to
>>> >> add in capacities. Please see the queue properties documentation at
>>> >> http://hadoop.apache.org/common/docs/current/capacity_scheduler.html#Queue+properties
>>> >> for the vital configs required in addition to mapred.queue.names.
>>> >> Once done, you should have a fully functional JobTracker!
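A minimal sketch of the per-queue entries Harsh is referring to, using the two queue names and capacities that appear later in this thread; these belong in conf/capacity-scheduler.xml, and every queue listed in mapred.queue.names needs one:

  <!-- Sketch: conf/capacity-scheduler.xml; values are percentages of
       cluster capacity and should not exceed 100 in total. -->
  <property>
    <name>mapred.capacity-scheduler.queue.myqueue1.capacity</name>
    <value>25</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.myqueue2.capacity</name>
    <value>75</value>
  </property>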
>>> >>
>>> >> On Fri, Sep 16, 2011 at 10:17 AM, arun k <arunk...@gmail.com> wrote:
>>> >> > Hi all!
>>> >> >
>>> >> > Harsh! The NameNode appears to be out of safe mode. At http://nn-host:50070
>>> >> > I see, over time:
>>> >> >
>>> >> > T1> Safe mode is ON. The ratio of reported blocks 0.0000 has not reached the
>>> >> > threshold 0.9990. Safe mode will be turned off automatically.
>>> >> > 7 files and directories, 1 blocks = 8 total. Heap Size is 15.06 MB / 966.69 MB (1%)
>>> >> >
>>> >> > T2> Safe mode is ON. The ratio of reported blocks 1.0000 has reached the
>>> >> > threshold 0.9990. Safe mode will be turned off automatically in 17 seconds.
>>> >> > 7 files and directories, 1 blocks = 8 total. Heap Size is 15.06 MB / 966.69 MB (1%)
>>> >> >
>>> >> > T3> 9 files and directories, 3 blocks = 12 total. Heap Size is 15.06 MB / 966.69 MB (1%)
>>> >> >
>>> >> > Added properties:
>>> >> >
>>> >> > mapred.jobtracker.taskScheduler  org.apache.hadoop.mapred.CapacityTaskScheduler
>>> >> > mapred.queue.names  myqueue1,myqueue2
>>> >> > mapred.capacity-scheduler.queue.myqueue1.capacity  25
>>> >> > mapred.capacity-scheduler.queue.myqueue2.capacity  75
>>> >> >
>>> >> > ${HADOOP_HOME}$ bin/hadoop jar hadoop*examples*.jar wordcount
>>> >> > -Dmapred.job.queue.name=myqueue1 /user/hduser/wcinput /user/hduser/wcoutput
>>> >> >
>>> >> > I get the error:
>>> >> > java.io.IOException: Call to localhost/127.0.0.1:54311 failed on local
>>> >> > exception: java.io.IOException: Connection reset by peer
>>> >> >     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1065)
>>> >> >     at org.apache.hadoop.ipc.Client.call(Client.java:1033)
>>> >> >     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
>>> >> >     ...................
>>> >> >
>>> >> > When I run
>>> >> > $ jps
>>> >> > 32463 NameNode
>>> >> > 32763 SecondaryNameNode
>>> >> > 32611 DataNode
>>> >> > 931 Jps
>>> >> >
>>> >> > The JobTracker log shows:
>>> >> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> >> > 2011-09-16 00:21:42,012 INFO org.apache.hadoop.mapred.JobTracker: Cleaning
>>> >> > up the system directory
>>> >> > 2011-09-16 00:21:42,014 INFO org.apache.hadoop.mapred.JobTracker: problem
>>> >> > cleaning system directory: hdfs://localhost:54310/app203/hadoop203/tmp/mapred/system
>>> >> > org.apache.hadoop.ipc.RemoteException:
>>> >> > org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete
>>> >> > /app203/hadoop203/tmp/mapred/system. Name node is in safe mode.
>>> >> > The ratio of reported blocks 1.0000 has reached the threshold 0.9990. Safe
>>> >> > mode will be turned off automatically in 6 seconds.
>>> >> >     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:1851)
>>> >> >     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:1831)
>>> >> > 2011-09-16 00:21:52,321 FATAL org.apache.hadoop.mapred.JobTracker:
>>> >> > java.io.IOException: Queue 'myqueue1' doesn't have configured capacity!
>>> >> >     at org.apache.hadoop.mapred.CapacityTaskScheduler.parseQueues(CapacityTaskScheduler.java:905)
>>> >> >     at org.apache.hadoop.mapred.CapacityTaskScheduler.start(CapacityTaskScheduler.java:822)
>>> >> >     at org.apache.hadoop.mapred.JobTracker.offerService(JobTracker.java:2563)
>>> >> >     at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4957)
>>> >> >
>>> >> > 2011-09-16 00:21:52,322 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG:
>>> >> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> >> >
>>> >> > Even if I submit the job to "myqueue2" I see the same error about "myqueue1":
>>> >> > 2011-09-16 00:21:52,321 FATAL org.apache.hadoop.mapred.JobTracker:
>>> >> > java.io.IOException: Queue 'myqueue1' doesn't have configured capacity!
>>> >> >
>>> >> > Thanks,
>>> >> > Arun
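Arun's eventual fix upthread (queue declarations in mapred-site.xml, capacities in capacity-scheduler.xml) would look roughly like the following sketch, assuming the stock 0.20.x property names quoted above:

  <!-- Sketch: conf/mapred-site.xml; choose the scheduler and declare the queues -->
  <property>
    <name>mapred.jobtracker.taskScheduler</name>
    <value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
  </property>
  <property>
    <name>mapred.queue.names</name>
    <value>myqueue1,myqueue2</value>
  </property>

The per-queue capacity properties then go in conf/capacity-scheduler.xml, as in the earlier sketch. Leaving them in mapred-site.xml is what produces the "Queue 'myqueue1' doesn't have configured capacity!" FATAL above, since the Capacity Scheduler reads its capacities from capacity-scheduler.xml.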
>>> >> >
>>> >> > On Thu, Sep 15, 2011 at 5:23 PM, Harsh J <ha...@cloudera.com> wrote:
>>> >> >>
>>> >> >> Hello Arun,
>>> >> >>
>>> >> >> To me it looks like your HDFS isn't set up properly in this case. Can
>>> >> >> you ensure all DNs are properly up? Your NN appears to have gotten
>>> >> >> stuck in safe mode somehow. Check your http://nn-host:50070 page for
>>> >> >> more details on why.
>>> >> >>
>>> >> >> Your JT won't come up until the NN is properly up and out of safe mode
>>> >> >> (for which it needs the DNs). And once it comes up, I think you should
>>> >> >> be good to go, keeping in mind the changes Thomas mentioned earlier.
>>> >> >>
>>> >> >> On Thu, Sep 15, 2011 at 3:58 PM, arun k <arunk...@gmail.com> wrote:
>>> >> >> > Hi all!
>>> >> >> >
>>> >> >> > Thanks, Thomas! It's working in the terminal.
>>> >> >> > I saw the queues in the web UI of the JT.
>>> >> >> > When I try to run normally again (default queue) I get the error below.
>>> >> >> > I tried formatting the NameNode, turning safe mode off, and restarting,
>>> >> >> > but that didn't work.
>>> >> >> >
>>> >> >> > hduser@arun-Presario-C500-RU914PA-ACJ:/usr/local/hadoop$ bin/hadoop jar
>>> >> >> > hadoop*examples*.jar wordcount /user/hduser/wcinput /user/hduser/wcoutput6
>>> >> >> > java.io.IOException: Call to localhost/127.0.0.1:54311 failed on local
>>> >> >> > exception: java.io.IOException: Connection reset by peer
>>> >> >> >
>>> >> >> > The JobTracker log shows:
>>> >> >> > 2011-09-15 12:46:13,346 INFO org.apache.hadoop.mapred.JobTracker: JobTracker up at: 54311
>>> >> >> > 2011-09-15 12:46:13,347 INFO org.apache.hadoop.mapred.JobTracker: JobTracker webserver: 50030
>>> >> >> > 2011-09-15 12:46:13,634 INFO org.apache.hadoop.mapred.JobTracker: Cleaning
>>> >> >> > up the system directory
>>> >> >> > 2011-09-15 12:46:13,646 INFO org.apache.hadoop.mapred.JobTracker: problem
>>> >> >> > cleaning system directory: hdfs://localhost:54310/app/hadoop/tmp/mapred/system
>>> >> >> > org.apache.hadoop.ipc.RemoteException:
>>> >> >> > org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete
>>> >> >> > /app/hadoop/tmp/mapred/system. Name node is in safe mode.
>>> >> >> >
>>> >> >> > Thanks,
>>> >> >> > Arun
>>> >> >> >
>>> >> >> > On Wed, Sep 14, 2011 at 7:46 PM, Thomas Graves <tgra...@yahoo-inc.com> wrote:
>>> >> >> >>
>>> >> >> >> I believe it defaults to submitting the job to the default queue if you
>>> >> >> >> don't specify one. You don't have the default queue defined in your list
>>> >> >> >> of mapred.queue.names, so add -Dmapred.job.queue.name=myqueue1 (or
>>> >> >> >> another queue you have defined) to the wordcount command, like:
>>> >> >> >>
>>> >> >> >> bin/hadoop jar hadoop*examples*.jar wordcount -Dmapred.job.queue.name=myqueue1
>>> >> >> >> /user/hduser/wcinput /user/hduser/wcoutput5
>>> >> >> >>
>>> >> >> >> Tom
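One detail worth making explicit (it holds for the stock examples, which run -D flags through GenericOptionsParser): the -Dmapred.job.queue.name flag must come before the positional input/output arguments, exactly as in Tom's command:

  $ bin/hadoop jar hadoop*examples*.jar wordcount \
      -Dmapred.job.queue.name=myqueue1 \
      /user/hduser/wcinput /user/hduser/wcoutput5

Putting the -D option after the paths would likely leave it as a plain leftover argument rather than applying it as a configuration override.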
>>> >> >> >>
>>> >> >> >> On 9/14/11 5:57 AM, "arun k" <arunk...@gmail.com> wrote:
>>> >> >> >>
>>> >> >> >> > Hi!
>>> >> >> >> >
>>> >> >> >> > I have set up a single-node cluster using
>>> >> >> >> > http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
>>> >> >> >> > and could run the wordcount example application.
>>> >> >> >> > I was trying to run this application using the Capacity Scheduler.
>>> >> >> >> > As per http://hadoop.apache.org/common/docs/current/capacity_scheduler.html
>>> >> >> >> > I have done:
>>> >> >> >> > 1. Copied the hadoop-capacity-scheduler-*.jar from the
>>> >> >> >> > contrib/capacity-scheduler directory to HADOOP_HOME/lib.
>>> >> >> >> > 2. Set mapred.jobtracker.taskScheduler.
>>> >> >> >> > 3. Set mapred.queue.names to myqueue1,myqueue2.
>>> >> >> >> > 4. Set mapred.capacity-scheduler.queue.<queue-name>.capacity to 30 and
>>> >> >> >> > 70 for the two queues.
>>> >> >> >> >
>>> >> >> >> > When I run, I get the error:
>>> >> >> >> > hduser@arun-Presario-C500-RU914PA-ACJ:/usr/local/hadoop$ bin/hadoop jar
>>> >> >> >> > hadoop*examples*.jar wordcount /user/hduser/wcinput /user/hduser/wcoutput5
>>> >> >> >> > 11/09/14 16:00:56 INFO input.FileInputFormat: Total input paths to process : 4
>>> >> >> >> > org.apache.hadoop.ipc.RemoteException: java.io.IOException: Queue "default"
>>> >> >> >> > does not exist
>>> >> >> >> >     at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:2998)
>>> >> >> >> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> >> >> >> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> >> >> >> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> >> >> >> >     at java.lang.reflect.Method.invoke(Method.java:597)
>>> >> >> >> >     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>>> >> >> >> >     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>>> >> >> >> >     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>>> >> >> >> >     at java.security.AccessController.doPrivileged(Native Method)
>>> >> >> >> >     at javax.security.auth.Subject.doAs(Subject.java:396)
>>> >> >> >> >     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
>>> >> >> >> >
>>> >> >> >> >     at org.apache.hadoop.ipc.Client.call(Client.java:740)
>>> >> >> >> >     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>>> >> >> >> >     at org.apache.hadoop.mapred.$Proxy0.submitJob(Unknown Source)
>>> >> >> >> >     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:800)
>>> >> >> >> >     at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
>>> >> >> >> >     at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
>>> >> >> >> >     at org.apache.hadoop.examples.WordCount.main(WordCount.java:67)
>>> >> >> >> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> >> >> >> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> >> >> >> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> >> >> >> >     at java.lang.reflect.Method.invoke(Method.java:597)
>>> >> >> >> >     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>> >> >> >> >     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>> >> >> >> >     at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
>>> >> >> >> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> >> >> >> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> >> >> >> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> >> >> >> >     at java.lang.reflect.Method.invoke(Method.java:597)
>>> >> >> >> >     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>> >> >> >> >
>>> >> >> >> > I didn't submit the jobs to a particular queue as such. Do I need to
>>> >> >> >> > do it? How can I do it?
>>> >> >> >> > Any help?
>>> >> >> >> >
>>> >> >> >> > Thanks,
>>> >> >> >> > Arun
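Given Tom's diagnosis above, an alternative sketch is to keep the stock "default" queue in the list, so that submissions without an explicit queue keep working (the default queue would then also need its own capacity entry in capacity-scheduler.xml):

  <property>
    <name>mapred.queue.names</name>
    <value>default,myqueue1,myqueue2</value>
  </property>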
>>> >> >>
>>> >> >> --
>>> >> >> Harsh J
>>> >>
>>> >> --
>>> >> Harsh J
>>>
>>> --
>>> Harsh J
>
> --
> Regards,
> Swathi.V.
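Since the original question was about running from Eclipse, where command-line -D flags are awkward, here is a sketch of selecting the queue programmatically instead. The class name QueueSubmitSketch is hypothetical (not from the thread), the API is the 0.20.x org.apache.hadoop.mapreduce one the examples use, and the queue must still be declared with a capacity as discussed above:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
  import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

  // Hypothetical driver sketch against the 0.20.x mapreduce API.
  public class QueueSubmitSketch {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      // Must be set before the Job is created (Job copies the Configuration);
      // equivalent to -Dmapred.job.queue.name=myqueue1 on the command line.
      conf.set("mapred.job.queue.name", "myqueue1");

      Job job = new Job(conf, "queue-submit-sketch");
      job.setJarByClass(QueueSubmitSketch.class);
      // No mapper/reducer set: the identity defaults are enough to demonstrate
      // queue routing; a real job would configure its classes here.
      job.setOutputKeyClass(LongWritable.class);
      job.setOutputValueClass(Text.class);
      FileInputFormat.addInputPath(job, new Path(args[0]));
      FileOutputFormat.setOutputPath(job, new Path(args[1]));
      System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
  }

Run it from Eclipse with the HDFS input and output paths as program arguments, the same way the wordcount example was run above; the job should then appear under myqueue1 in the JobTracker's queue UI.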