Hey Alex,

Our cluster uses Cloudera for Hadoop admin stuff. To the best of my knowledge, we need to do our configs through the Cloudera GUI (which is a royal pain).
The "mapred.tasktracker.map.tasks.maximum" should be at 16 everywhere (save for a few of the higher nodes which already had it at 32). However, I can't figure out how to get at "mapred.map.tasks" from the Cloudera GUI.

Can you give it another try?

-Ryan

On Tue, Jul 30, 2013 at 8:58 AM, Kyle Orlando <kyle.r.orla...@gmail.com> wrote:
> Ah, that might be your problem.
>
> Try adding this between the <configuration> and </configuration>:
>
> <property>
>   <name>mapred.tasktracker.map.tasks.maximum</name>
>   <value>4</value>
> </property>
>
> <property>
>   <name>mapred.map.tasks</name>
>   <value>4</value>
> </property>
>
> See if it works, that's really the only thing I can think of.
>
> By default, the max number of map tasks and reduce tasks for Hadoop is
> 2. This changes the max number of map tasks to 4, and "hints" to
> Hadoop (whatever that means) that it should utilize 4 map tasks. I
> believe that Giraph workers hijack map tasks, so reduce tasks are
> unneeded, but someone who is more familiar with Giraph will have to
> tell you more.
>
> On Tue, Jul 30, 2013 at 11:38 AM, Alex Waagen <awaa...@gmail.com> wrote:
>> Here is the file. It is almost empty.
>>
>> <?xml version="1.0"?>
>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>
>> <!-- Put site-specific property overrides in this file. -->
>>
>> <configuration>
>>
>> </configuration>
>>
>> On Tue, Jul 30, 2013 at 8:33 AM, Kyle Orlando <kyle.r.orla...@gmail.com>
>> wrote:
>>>
>>> Hmmm, could you post the contents of your mapred-site.xml in
>>> $HADOOP_HOME/conf? You may need to increase the number of map tasks.
>>>
>>> On Tue, Jul 30, 2013 at 11:02 AM, Alex Waagen <awaa...@gmail.com> wrote:
>>> > I am having some trouble getting these examples running. I’m using
>>> > giraph version 1.1.0 and hadoop 0.20.2.
>>> > I am using the following json file as input:
>>> >
>>> > [0,0,[[1,1],[3,3]]]
>>> > [1,0,[[0,1],[2,2],[3,1]]]
>>> > [2,0,[[1,2],[4,4]]]
>>> > [3,0,[[0,3],[1,1],[4,4]]]
>>> > [4,0,[[3,4],[2,4]]]
>>> >
>>> > The command I use is:
>>> >
>>> > hadoop jar
>>> > /path-to-giraph/giraph-core/target/giraph-1.1.0-SNAPSHOT-for-hadoop-0.20.203.0-jar-with-dependencies.jar
>>> > org.apache.giraph.GiraphRunner
>>> > org.apache.giraph.examples.SimpleShortestPathsComputation -vif
>>> > org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat
>>> > -vip /path-to-input/input_file.json -of
>>> > org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op
>>> > /path-to-output/outShortest -w 1
>>> >
>>> > I see the following output.
>>> >
>>> > 13/07/29 14:36:06 INFO utils.ConfigurationUtils: No edge input format specified. Ensure your InputFormat does not require one.
>>> > 13/07/29 14:36:06 INFO job.GiraphJob: run: Since checkpointing is disabled (default), do not allow any task retries (setting mapred.map.max.attempts = 0, old value = 4)
>>> > 13/07/29 14:36:20 INFO mapred.JobClient: Running job: job_201307232135_0588
>>> > 13/07/29 14:36:21 INFO mapred.JobClient: map 0% reduce 0%
>>> > 13/07/29 14:36:52 INFO mapred.JobClient: map 50% reduce 0%
>>> > 13/07/29 14:47:24 INFO mapred.JobClient: map 0% reduce 0%
>>> > 13/07/29 14:47:39 INFO mapred.JobClient: Job complete: job_201307232135_0588
>>> > 13/07/29 14:47:39 INFO mapred.JobClient: Counters: 6
>>> > 13/07/29 14:47:39 INFO mapred.JobClient: Job Counters
>>> > 13/07/29 14:47:39 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=670508
>>> > 13/07/29 14:47:39 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
>>> > 13/07/29 14:47:39 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
>>> > 13/07/29 14:47:39 INFO mapred.JobClient: Launched map tasks=2
>>> > 13/07/29 14:47:39 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
>>> > 13/07/29 14:47:39 INFO mapred.JobClient: Failed map tasks=1
>>> >
>>> > When I check the job tracker, I see that two map jobs were killed, with
>>> > the following errors:
>>> >
>>> > java.lang.Throwable: Child Error
>>> >     at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:242)
>>> > Caused by: java.io.IOException: Task process exit with nonzero status of 1.
>>> >     at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:229)
>>> >
>>> > java.lang.IllegalStateException: run: Caught an unrecoverable exception exists: Failed to check /_hadoopBsp/job_201307232135_0588/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions after 3 tries!
>>> >     at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:101)
>>> >     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>>> >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>>> >     at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>>> >     at java.security.AccessController.doPrivileged(Native Method)
>>> >     at javax.security.auth.Subject.doAs(Subject.java:396)
>>> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
>>> >     at org.apache.hadoop.mapred.Child.main(Child.java:264)
>>> > Caused by: java.lang.IllegalStateException: exists: Failed to check /_hadoopBsp/job_201307232135_0588/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions after 3 tries!
>>> >     at org.apache.giraph.zk.ZooKeeperExt.exists(ZooKeeperExt.java:369)
>>> >     at org.apache.giraph.worker.BspServiceWorker.s
>>> >
>>> > Task attempt_201307232135_0588_m_000001_0 failed to report status for
>>> > 600 seconds. Killing!
>>> >
>>> > Any idea what the problem is?
>>> > Thanks in advance.
>>>
>>> --
>>> Kyle Orlando
>>> Computer Engineering Major
>>> University of Maryland
>>
>
> --
> Kyle Orlando
> Computer Engineering Major
> University of Maryland
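
Since the thread keeps coming back to which values a node actually has for "mapred.tasktracker.map.tasks.maximum" and "mapred.map.tasks", here is a minimal sketch of one way to read them out of a mapred-site.xml. The two property names and the fallback of 2 come from the messages above. The helper names (read_properties, map_slots) and the inline sample are hypothetical, and on a Cloudera-managed cluster the file a TaskTracker really uses may live somewhere other than $HADOOP_HOME/conf, so treat this as a quick sanity check rather than an authoritative answer.

```python
import xml.etree.ElementTree as ET

# Property names as quoted in the thread; everything else is illustrative.
SLOTS_KEY = "mapred.tasktracker.map.tasks.maximum"
HINT_KEY = "mapred.map.tasks"


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop *-site.xml string."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        # Skip malformed entries that lack a name or a value.
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def map_slots(props, default=2):
    """Map slots per TaskTracker; falls back to the default of 2 mentioned in the thread."""
    return int(props.get(SLOTS_KEY, default))


if __name__ == "__main__":
    # Hypothetical sample mirroring the snippet suggested earlier in the thread.
    sample = """<configuration>
      <property>
        <name>mapred.tasktracker.map.tasks.maximum</name>
        <value>4</value>
      </property>
      <property>
        <name>mapred.map.tasks</name>
        <value>4</value>
      </property>
    </configuration>"""
    props = read_properties(sample)
    print("map slots per tasktracker:", map_slots(props))
    print("map task hint:", props.get(HINT_KEY, "(not set)"))
```

To check a real file, read it with open(path).read() and pass the text to read_properties. An almost-empty file like the one posted above yields no properties, so map_slots falls back to 2.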