Hey Alex,

Our cluster uses Cloudera for Hadoop admin stuff. To the best of my
knowledge, we need to do our configs through the Cloudera GUI (which
is a royal pain).

The "mapred.tasktracker.map.tasks.maximum" should be at 16 everywhere
(save for a few of the higher nodes which already had it at 32).
However, I can't figure out how to get at "mapred.map.tasks" from the
Cloudera GUI. Can you give it another try?
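
In case it helps to see the edit Kyle describes below spelled out, here
is a rough Python sketch that adds or updates <property> entries in a
mapred-site.xml. The function name set_properties is made up for this
example, and the values 16 and 4 are just the ones mentioned in this
thread. On a Cloudera-managed cluster a hand-edited file may well get
overwritten, so treat this as an illustration of the file format rather
than the way to do it on our cluster.

```python
import xml.etree.ElementTree as ET

# The nearly empty mapred-site.xml that Alex posted.
EMPTY_SITE = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
</configuration>"""


def set_properties(site_xml, props):
    """Return the <configuration> element as text, with every name/value
    pair in props added (or updated if the name is already present).

    Hypothetical helper, not part of Hadoop, Giraph, or Cloudera.
    Note: the XML declaration, stylesheet line, and comments are not
    carried over by ElementTree's default parser.
    """
    root = ET.fromstring(site_xml)
    if root.tag != "configuration":
        raise ValueError("expected a <configuration> root element")

    # Index existing properties by name so repeated runs do not duplicate them.
    existing = {p.findtext("name"): p for p in root.findall("property")}

    for name, value in props.items():
        prop = existing.get(name)
        if prop is None:
            prop = ET.SubElement(root, "property")
            ET.SubElement(prop, "name").text = name
            ET.SubElement(prop, "value").text = str(value)
        else:
            val = prop.find("value")
            if val is None:
                val = ET.SubElement(prop, "value")
            val.text = str(value)

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(set_properties(EMPTY_SITE, {
        "mapred.tasktracker.map.tasks.maximum": 16,
        "mapred.map.tasks": 4,
    }))
```

Running it prints a <configuration> block containing both properties.
Calling set_properties again on that output with a different value
updates the existing entry instead of adding a second one.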


On Tue, Jul 30, 2013 at 8:58 AM, Kyle Orlando <kyle.r.orla...@gmail.com> wrote:
> Ah, that might be your problem.
> Try adding this between the <configuration> and </configuration>:
> <property>
>   <name>mapred.tasktracker.map.tasks.maximum</name>
>   <value>4</value>
> </property>
> <property>
>   <name>mapred.map.tasks</name>
>   <value>4</value>
> </property>
> See if it works, that's really the only thing I can think of.
> By default, the max number of map tasks and reduce tasks per TaskTracker
> in Hadoop is 2.  This changes the max number of map tasks to 4, and
> "hints" to Hadoop (whatever that means) that it should utilize 4 map
> tasks.  I believe that Giraph workers hijack map tasks, so reduce tasks
> are unneeded, but someone who is more familiar with Giraph will have to
> tell you more.
> On Tue, Jul 30, 2013 at 11:38 AM, Alex Waagen <awaa...@gmail.com> wrote:
>> Here is the file. It is almost empty.
>> <?xml version="1.0"?>
>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>> <!-- Put site-specific property overrides in this file. -->
>> <configuration>
>> </configuration>
>> On Tue, Jul 30, 2013 at 8:33 AM, Kyle Orlando <kyle.r.orla...@gmail.com>
>> wrote:
>>> Hmmm, could you post the contents of your mapred-site.xml in
>>> $HADOOP_HOME/conf?  You may need to increase the number of map tasks.
>>> On Tue, Jul 30, 2013 at 11:02 AM, Alex Waagen <awaa...@gmail.com> wrote:
>>> > I am having some trouble getting these examples running. I’m using
>>> > Giraph version 1.1.0 and Hadoop 0.20.2. I am using the following JSON
>>> > file as input:
>>> >
>>> > [0,0,[[1,1],[3,3]]]
>>> > [1,0,[[0,1],[2,2],[3,1]]]
>>> > [2,0,[[1,2],[4,4]]]
>>> > [3,0,[[0,3],[1,1],[4,4]]]
>>> > [4,0,[[3,4],[2,4]]]
>>> >
>>> > The command I use is:
>>> >
>>> > hadoop jar
>>> >
>>> > /path-to-giraph/giraph-core/target/giraph-1.1.0-SNAPSHOT-for-hadoop-
>>> > org.apache.giraph.GiraphRunner
>>> > org.apache.giraph.examples.SimpleShortestPathsComputation -vif
>>> > org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat
>>> > -vip
>>> > /path-to-input/input_file.json -of
>>> > org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op
>>> > /path-to-output/outShortest -w 1
>>> >
>>> >
>>> > I see the following output.
>>> >
>>> > 13/07/29 14:36:06 INFO utils.ConfigurationUtils: No edge input format
>>> > specified. Ensure your InputFormat does not require one.
>>> > 13/07/29 14:36:06 INFO job.GiraphJob: run: Since checkpointing is
>>> > disabled (default), do not allow any task retries (setting
>>> > mapred.map.max.attempts = 0, old value = 4)
>>> > 13/07/29 14:36:20 INFO mapred.JobClient: Running job:
>>> > job_201307232135_0588
>>> > 13/07/29 14:36:21 INFO mapred.JobClient: map 0% reduce 0%
>>> > 13/07/29 14:36:52 INFO mapred.JobClient: map 50% reduce 0%
>>> > 13/07/29 14:47:24 INFO mapred.JobClient: map 0% reduce 0%
>>> > 13/07/29 14:47:39 INFO mapred.JobClient: Job complete:
>>> > job_201307232135_0588
>>> > 13/07/29 14:47:39 INFO mapred.JobClient: Counters: 6
>>> > 13/07/29 14:47:39 INFO mapred.JobClient: Job Counters
>>> > 13/07/29 14:47:39 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=670508
>>> > 13/07/29 14:47:39 INFO mapred.JobClient: Total time spent by all reduces
>>> > waiting after reserving slots (ms)=0
>>> > 13/07/29 14:47:39 INFO mapred.JobClient: Total time spent by all maps
>>> > waiting after reserving slots (ms)=0
>>> > 13/07/29 14:47:39 INFO mapred.JobClient: Launched map tasks=2
>>> > 13/07/29 14:47:39 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
>>> > 13/07/29 14:47:39 INFO mapred.JobClient: Failed map tasks=1
>>> >
>>> >
>>> > When I check the job tracker, I see that two map jobs were killed,
>>> > with the following errors:
>>> >
>>> > java.lang.Throwable: Child Error
>>> > at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:242)
>>> > Caused by: java.io.IOException: Task process exit with nonzero status of
>>> > 1.
>>> > at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:229)
>>> >
>>> > java.lang.IllegalStateException: run: Caught an unrecoverable exception
>>> > exists: Failed to check
>>> >
>>> > /_hadoopBsp/job_201307232135_0588/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions
>>> > after 3 tries!
>>> > at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:101)
>>> > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>>> > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>>> > at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>>> > at java.security.AccessController.doPrivileged(Native Method)
>>> > at javax.security.auth.Subject.doAs(Subject.java:396)
>>> > at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
>>> > at org.apache.hadoop.mapred.Child.main(Child.java:264)
>>> > Caused by: java.lang.IllegalStateException: exists: Failed to check
>>> >
>>> > /_hadoopBsp/job_201307232135_0588/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions
>>> > after 3 tries!
>>> > at org.apache.giraph.zk.ZooKeeperExt.exists(ZooKeeperExt.java:369)
>>> > at org.apache.giraph.worker.BspServiceWorker.s
>>> > Task attempt_201307232135_0588_m_000001_0 failed to report status for
>>> > 600 seconds. Killing!
>>> >
>>> > Any idea what the problem is?
>>> > Thanks in advance.
>>> --
>>> Kyle Orlando
>>> Computer Engineering Major
>>> University of Maryland
