Problem with running the job, no default queue
Tried a simple example job on Yahoo M45. The job fails because a default queue does not exist; the output is attached below. On the Apache Hadoop mailing list I found a post (specific to M45) that addressed this problem by setting the property -Dmapred.job.queue.name=myqueue (http://web.archiveorange.com/archive/v/3inw3ySGHmNRR9Bm14Uv). There is also documentation for the capacity scheduler, but I do not have write access to the files in the conf directory, so I do not know how I can configure the capacity scheduler there. I am also posting this question on the general list, just in case.

$ hadoop jar /grid/0/gs/hadoop/current/hadoop-examples.jar pi 10 1
Number of Maps  = 10
Samples per Map = 1
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
11/02/10 04:19:22 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 83705 for sgrao
11/02/10 04:19:22 INFO security.TokenCache: Got dt for hdfs://grit-nn1.yahooresearchcluster.com/user/sgrao/.staging/job_201101150035_26053;uri=68.180.138.10:8020;t.service=68.180.138.10:8020
11/02/10 04:19:22 INFO mapred.FileInputFormat: Total input paths to process : 10
11/02/10 04:19:23 INFO mapred.JobClient: Cleaning up the staging area hdfs://grit-nn1.yahooresearchcluster.com/user/sgrao/.staging/job_201101150035_26053
org.apache.hadoop.ipc.RemoteException: java.io.IOException: Queue "default" does not exist
    at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3680)
    at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1301)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1297)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1062)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1295)
    at org.apache.hadoop.ipc.Client.call(Client.java:951)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:223)
    at org.apache.hadoop.mapred.$Proxy6.submitJob(Unknown Source)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:818)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:752)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1062)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:752)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:726)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1156)
    at org.apache.hadoop.examples.PiEstimator.estimate(PiEstimator.java:297)
    at org.apache.hadoop.examples.PiEstimator.run(PiEstimator.java:342)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.examples.PiEstimator.main(PiEstimator.java:351)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
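For reference, a sketch of the fix the archived post describes. The stack trace above shows the pi example running through ToolRunner, so a -D generic option is only picked up if it appears after the program name but before the program's own positional arguments; "myqueue" here is a placeholder, not a queue confirmed to exist on M45:

$ hadoop jar /grid/0/gs/hadoop/current/hadoop-examples.jar pi -Dmapred.job.queue.name=myqueue 10 1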
Re: Problem with running the job, no default queue
Hello Koji,

Thanks for your email. Is there an M45-specific mailing list? That would be really beneficial.

I tried the hadoop queue command and it gives me:

$ hadoop queue -showacls
Queue acls for user : sgrao
Queue  Operations
=====================
m45    submit-job

Need I specify this queue name in my properties, or modify the mapred-queue-acls.xml file? I do not think I have the authorization to do so.

I even ran an info query on the queue name I found, "m45":

$ hadoop queue -info m45
Queue Name : m45
Scheduling Info : Queue configuration
Capacity Percentage: 90.0%
User Limit: 20%
Priority Supported: NO
- Map tasks
Capacity: 684 slots
Used capacity: 400 (58.5% of Capacity)
Running tasks: 200
Active users:
User 'scohen': 400 (100.0% of used capacity)
- Reduce tasks
Capacity: 342 slots
Used capacity: 5 (1.5% of Capacity)
Running tasks: 5
Active users:
User 'ukang': 5 (100.0% of used capacity)
- Job info
Number of Waiting Jobs: 2
Number of users who have submitted jobs: 2

How do I request a queue? I tried adding -Dmapred.queue.name to the end of the hadoop command:

$ hadoop jar /grid/0/gs/hadoop/current/hadoop-examples.jar pi 10 1 -Dmapred.queue.name=m45

It gives me a usage error, as if I cannot specify the queue or I am using the wrong syntax. I have not been able to find the right syntax, so I am not sure how to specify the queue name or request a queue.

Regards,
Shivani

----- Original Message -----
From: "Koji Noguchi"
To: "Shivani Rao", "Tim Korb"
Cc: "Viraj Bhat", "Avinash C Kak", common-user@hadoop.apache.org, common-...@hadoop.apache.org
Sent: Monday, February 14, 2011 1:12:49 PM
Subject: Re: Problem with running the job, no default queue

Hi Shivani,

You probably don't want to ask M45-specific questions on the hadoop.apache mailing list.

Try
% hadoop queue -showacls

It should show which queues you're allowed to submit to. If it doesn't give you any queues, you need to request one.

Koji
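Once submission works, the same queue CLI can confirm where a job landed; -showJobs is the flag documented for the 0.20-era queue command, so treat this as a sketch rather than verified M45 behavior:

$ hadoop queue -info m45 -showJobs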
Specify queue name for a hadoop job
I am having trouble specifying the queue name a job should be submitted to, and I would like to know if I am using the right syntax. I find the name of the queue that is assigned to me:

$ hadoop queue -showacls

and find that it is <queuename>. I then run a simple pi estimate using the following command:

$ hadoop jar /path to hadoop examples/hadoop-examples.jar pi 10 1 -Dmapred.queue.name=<queuename>

and I get a syntax error. If I don't specify the queue name, I get a java.io.IOException saying "could not find 'default' queue". If anybody has specified queues before, your help is appreciated.

Regards,
Shivani
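A hedged guess at two problems in that command, based on the M45 thread above: generic options are parsed by GenericOptionsParser only when they come before the program's positional arguments, and the capacity scheduler property is mapred.job.queue.name rather than mapred.queue.name. With m45 standing in for whatever queue -showacls reports:

$ hadoop jar /path/to/hadoop-examples.jar pi -Dmapred.job.queue.name=m45 10 1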
a hadoop input format question
I am running basic hadoop examples on Amazon EMR and I am stuck at a very simple place: I am apparently not passing the right "classname" for the input format. From the hadoop documentation it seems like "TextInputFormat" is a valid option for the input format. I am running a simple sort example using mapreduce. Here is the command variation I tried, all in vain:

$ /usr/local/hadoop/bin/hadoop jar /path to hadoop examples/hadoop-0.18.0-examples.jar sort -inFormat TextInputFormat -outFormat TextOutputFormat /path to datainput/datain/ /path to data output/dataout

The sort function does not declare "TextInputFormat" in its import list. Could that be the problem? Could it be a version problem? Any help is appreciated!

Shivani
--
Research Scholar,
School of Electrical and Computer Engineering
Purdue University
West Lafayette IN
web.ics.purdue.edu/~sgrao
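One likely cause, offered as a sketch: the stock sort driver resolves the format names with Class.forName, so a bare name like TextInputFormat does not resolve; fully qualified class names are needed, and the output key/value classes must match what TextInputFormat actually emits (LongWritable byte offsets and Text lines). Assuming the 0.18 sort example accepts the -outKey/-outValue flags, and with datain/ and dataout/ as placeholder paths:

$ /usr/local/hadoop/bin/hadoop jar hadoop-0.18.0-examples.jar sort \
    -inFormat org.apache.hadoop.mapred.TextInputFormat \
    -outFormat org.apache.hadoop.mapred.TextOutputFormat \
    -outKey org.apache.hadoop.io.LongWritable \
    -outValue org.apache.hadoop.io.Text \
    datain/ dataout/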
Hadoop 0.21 running problems, no namenode to stop
Problems running a local installation of hadoop on a single-node cluster. I followed the instructions given by tutorials to run hadoop-0.21 on a single-node cluster. The first problem I encountered was that of HADOOP-6953; thankfully that has been fixed. The other problem I am facing is that the datanode does not start. I guess this because when I run stop-dfs.sh, I get the message "no datanode to stop" for the datanode. I am wondering if it is remotely related to the difference in the IP addresses on my computer:

127.0.0.1 localhost
127.0.1.1 my-laptop

Although I am aware of this, I do not know how to fix it. I am unable to run even a simple pi estimate example on the hadoop installation. This is the output I get:

$ bin/hadoop jar hadoop-mapred-examples-0.21.0.jar pi 10 10
Number of Maps  = 10
Samples per Map = 10
11/03/02 23:38:47 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=30

And nothing else for a long, long time. I have not set dfs.name.dir and dfs.data.dir in my hdfs-site.xml, but after running bin/hadoop namenode -format, I see that the tmp dir has a folder with dfs/name and dfs/data folders for the two directories. What am I doing wrong? Any help is appreciated. Here are my configuration files.

Regards,
Shivani

hdfs-site.xml:

<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time.</description>
</property>

core-site.xml:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/hadoop-${user.name}</value>
  <description>A base for other temporary directories.</description>
</property>

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
  <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem.</description>
</property>

mapred-site.xml:

<property>
  <name>mapred.job.tracker</name>
  <value>localhost:54311</value>
  <description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description>
</property>
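A hedged checklist for this symptom: "no datanode to stop" usually means the DataNode process never started, and a common reason on a reformatted single-node setup is a stale dfs/data directory left under hadoop.tmp.dir from before the last namenode -format (the datanode log then shows an "Incompatible namespaceIDs" IOException). The path below is derived from the core-site.xml above and assumes the unix user is "shivani"; wiping it destroys any HDFS data:

$ jps                                    # shows which daemons are actually running
$ bin/stop-dfs.sh
$ rm -rf /usr/local/hadoop-shivani/dfs   # hypothetical path built from hadoop.tmp.dir
$ bin/hadoop namenode -format
$ bin/start-dfs.sh
$ jps                                    # DataNode should now stay up; if not, check logs/*datanode*.log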
hadoop streaming and job conf settings
Hello,

I am facing trouble using hadoop streaming to solve a simple nearest neighbor problem. The input data is in the following format:

<key> '\t' <value>

The key is the imageid for which the nearest neighbor will be computed; the value is a 100-dimensional vector of floating point values separated by space or tab. The mapper reads in the query (the query is a 100-dimensional vector) and each line of the input, and outputs a <key2, value2> pair, where key2 is a floating point value indicating the distance and value2 is the imageid. The number of reducers is set to 1, and the reducer is set to be the identity reducer.

I tried to use the following command:

$ bin/hadoop jar ./mapred/contrib/streaming/hadoop-0.21.0-streaming.jar -Dmapreduce.job.output.key.class=org.apache.hadoop.io.DoubleWritable -files /home/shivani/research/toolkit/mathouttuts/nearestneighbor/code/IdentityMapper.R#file1 -input datain/comparedata -output dataout5 -mapper file1 -reducer org.apache.hadoop.mapred.lib.IdentityReducer -verbose

The output stream is as below. The failure is in the mapper itself, more specifically in the TextOutputReader. I am not sure how to fix this. The logs are attached below:

11/04/13 13:22:15 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=30
11/04/13 13:22:15 WARN conf.Configuration: mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
STREAM: addTaskEnvironment=
STREAM: shippedCanonFiles_=[]
STREAM: shipped: false /usr/local/hadoop/file1
STREAM: cmd=file1
STREAM: cmd=null
STREAM: shipped: false /usr/local/hadoop/org.apache.hadoop.mapred.lib.IdentityReducer
STREAM: cmd=org.apache.hadoop.mapred.lib.IdentityReducer
11/04/13 13:22:15 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
STREAM: Found runtime classes in: /usr/local/hadoop-hadoop/hadoop-unjar7358684340334149267/
packageJobJar: [/usr/local/hadoop-hadoop/hadoop-unjar7358684340334149267/] [] /tmp/streamjob2923554781371902680.jar tmpDir=null
JarBuilder.addNamedStream META-INF/MANIFEST.MF
JarBuilder.addNamedStream org/apache/hadoop/typedbytes/TypedBytesWritable.class
JarBuilder.addNamedStream org/apache/hadoop/typedbytes/TypedBytesRecordOutput$1.class
JarBuilder.addNamedStream org/apache/hadoop/typedbytes/TypedBytesWritableOutput$1.class
JarBuilder.addNamedStream org/apache/hadoop/typedbytes/TypedBytesRecordOutput.class
JarBuilder.addNamedStream org/apache/hadoop/typedbytes/TypedBytesOutput.class
JarBuilder.addNamedStream org/apache/hadoop/typedbytes/TypedBytesOutput$1.class
JarBuilder.addNamedStream org/apache/hadoop/typedbytes/TypedBytesInput$1.class
JarBuilder.addNamedStream org/apache/hadoop/typedbytes/TypedBytesWritableOutput.class
JarBuilder.addNamedStream org/apache/hadoop/typedbytes/TypedBytesRecordInput$TypedBytesIndex.class
JarBuilder.addNamedStream org/apache/hadoop/typedbytes/TypedBytesWritableInput$2.class
JarBuilder.addNamedStream org/apache/hadoop/typedbytes/TypedBytesWritableInput.class
JarBuilder.addNamedStream org/apache/hadoop/typedbytes/TypedBytesRecordInput.class
JarBuilder.addNamedStream org/apache/hadoop/typedbytes/Type.class
JarBuilder.addNamedStream org/apache/hadoop/typedbytes/TypedBytesWritableInput$1.class
JarBuilder.addNamedStream org/apache/hadoop/typedbytes/TypedBytesRecordInput$1.class
JarBuilder.addNamedStream org/apache/hadoop/typedbytes/TypedBytesInput.class
JarBuilder.addNamedStream org/apache/hadoop/streaming/StreamUtil$TaskId.class
JarBuilder.addNamedStream org/apache/hadoop/streaming/PipeMapRed$1.class
JarBuilder.addNamedStream org/apache/hadoop/streaming/StreamJob.class
JarBuilder.addNamedStream org/apache/hadoop/streaming/StreamUtil.class
JarBuilder.addNamedStream org/apache/hadoop/streaming/Environment.class
JarBuilder.addNamedStream org/apache/hadoop/streaming/io/RawBytesOutputReader.class
JarBuilder.addNamedStream org/apache/hadoop/streaming/io/TypedBytesInputWriter.class
JarBuilder.addNamedStream org/apache/hadoop/streaming/io/TextInputWriter.class
JarBuilder.addNamedStream org/apache/hadoop/streaming/io/InputWriter.class
JarBuilder.addNamedStream org/apache/hadoop/streaming/io/TextOutputReader.class
JarBuilder.addNamedStream org/apache/hadoop/streaming/io/IdentifierResolver.class
JarBuilder.addNamedStream org/apache/hadoop/streaming/io/RawBytesInputWriter.class
JarBuilder.addNamedStream org/apache/hadoop/streaming/io/TypedBytesOutputReader.class
JarBuilder.addNamedStream org/apache/hadoop/streaming/io/OutputReader.class
JarBuilder.addNamedStream org/apache/hadoop/streaming/PipeMapRed.class
JarBuilder.addNamedStream org/apache/hadoop/streaming/PathFinder.class
JarBuilder.addNamedStream org/apache/hadoop/streaming/LoadTypedBytes.class
JarBuilder.addNamedStream org/apache/hadoop/streaming/StreamXmlRecordReader.class
JarBuilder.addNamedStream org/apache/hadoop/streaming/UTF8ByteArrayUtils.class
JarBuilder.addNamedStream org/apache/hadoop/streaming/JarBuilder.class
JarBuilder.addNamedSt
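Two hedged guesses at the TextOutputReader failure, since the attached logs stop before the task error itself: the mapper script may not be directly executable on the task nodes (an R script needs a "#!/usr/bin/env Rscript" shebang and execute permission), and declaring DoubleWritable as the job output key class conflicts with streaming's text reader, which hands back Text keys (typically surfacing as a "Type mismatch in key from map" error). A sketch with the key-class override dropped and a fresh output directory; the chmod and the dataout6 name are assumptions, everything else is from the original command:

$ chmod +x /home/shivani/research/toolkit/mathouttuts/nearestneighbor/code/IdentityMapper.R
$ bin/hadoop jar ./mapred/contrib/streaming/hadoop-0.21.0-streaming.jar \
    -D mapreduce.job.reduces=1 \
    -files /home/shivani/research/toolkit/mathouttuts/nearestneighbor/code/IdentityMapper.R#file1 \
    -input datain/comparedata \
    -output dataout6 \
    -mapper file1 \
    -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
    -verbose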