I tried this other configuration in mapred-site.xml and now it works (I get the _SUCCESS file and the expected output), but I think it is not the optimal configuration:
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>1024</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>1024</value>
</property>

On Tue, Oct 2, 2018 at 12:09 Francesco Sclano <sclano.france...@gmail.com> wrote:

> I'm using giraph-1.3.0-SNAPSHOT and hadoop-2.8.4 on an Amazon EC2 cluster.
> My cluster is composed of 4 t2.large machines, each with 8 GB RAM and 2
> CPUs (in the future I'll have to use 20 c3.8xlarge machines, each with
> 60 GB RAM and 32 CPUs).
> I'm blocked on this problem: "Giraph's estimated cluster heap xxxxMBs ask
> is greater than the current available cluster heap of 0MB. Aborting Job."
> I read this previous post
> https://stackoverflow.com/questions/28977337/giraphs-estimated-cluster-heap-4096mb-ask-is-greater-than-the-current-available
> but I didn't understand what caused the problem in my case, since I have
> configured yarn.resourcemanager.hostname (see below) and my security group
> is open to all traffic. Maybe I'm missing some settings (or some ports)?
>
> Furthermore, I have the following questions:
> - Since Giraph uses only map tasks and no reduce tasks, is it correct to
> assign less memory to mapreduce.reduce.memory.mb than to
> mapreduce.map.memory.mb? Could it even be right to assign 0 MB to
> mapreduce.reduce.memory.mb, since Giraph doesn't use reduce?
> - I read at http://giraph.apache.org/quick_start.html that
> mapred.tasktracker.map.tasks.maximum and mapred.map.tasks must be set to 4,
> since "by default hadoop allows 2 mappers to run at once. Giraph's code,
> however, assumes that we can run 4 mappers at the same time." So must these
> properties always be set to 4?
>
> This is my configuration. I reported only mapred-site.xml and
> yarn-site.xml, because I'm quite sure the other Hadoop configuration
> files are correct.
>
> mapred-site.xml
>
> <configuration>
>   <property>
>     <name>mapreduce.jobtracker.address</name>
>     <value>{HOSTNAME}:54311</value>
>   </property>
>   <property>
>     <name>mapreduce.framework.name</name>
>     <value>yarn</value>
>   </property>
>   <property>
>     <name>mapred.tasktracker.map.tasks.maximum</name>
>     <value>4</value>
>   </property>
>   <property>
>     <name>mapred.map.tasks</name>
>     <value>4</value>
>   </property>
>   <property>
>     <name>mapreduce.map.memory.mb</name>
>     <value>4608</value>
>   </property>
>   <property>
>     <name>mapreduce.reduce.memory.mb</name>
>     <value>512</value>
>   </property>
> </configuration>
>
> yarn-site.xml
>
> <configuration>
>   <property>
>     <name>yarn.nodemanager.aux-services</name>
>     <value>mapreduce_shuffle</value>
>   </property>
>   <property>
>     <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
>     <value>org.apache.hadoop.mapred.ShuffleHandler</value>
>   </property>
>   <property>
>     <name>yarn.resourcemanager.hostname</name>
>     <value>{HOSTNAME}</value>
>   </property>
>   <property>
>     <name>yarn.nodemanager.resource.cpu-vcores</name>
>     <value>2</value>
>   </property>
>   <property>
>     <name>yarn.app.mapreduce.am.resource.mb</name>
>     <value>2048</value>
>   </property>
>   <property>
>     <name>yarn.nodemanager.resource.memory-mb</name>
>     <value>6144</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.maximum-allocation-mb</name>
>     <value>6144</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.minimum-allocation-mb</name>
>     <value>512</value>
>   </property>
>   <property>
>     <name>yarn.nodemanager.vmem-check-enabled</name>
>     <value>false</value>
>   </property>
> </configuration>
>
> --
> Francesco Sclano

--
Francesco Sclano
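As a rough sanity check on the two memory settings discussed above, here is a back-of-the-envelope sketch (my own simplification, not from the thread: it counts only memory, ignores vcore limits and capacity-scheduler rounding, and assumes a single MapReduce ApplicationMaster container on one node) of how many map containers could fit on this 4-node cluster:

```python
# How many map containers fit cluster-wide, given the yarn-site.xml values
# quoted above. Counting per node respects fragmentation: the node hosting
# the AM has less free memory than the others.

NODES = 4          # t2.large machines
NODE_MEM_MB = 6144 # yarn.nodemanager.resource.memory-mb
AM_MEM_MB = 2048   # yarn.app.mapreduce.am.resource.mb

def max_mappers(map_mem_mb):
    """Map containers of the given size that fit, reserving one AM."""
    on_am_node = (NODE_MEM_MB - AM_MEM_MB) // map_mem_mb
    on_other_nodes = (NODES - 1) * (NODE_MEM_MB // map_mem_mb)
    return on_am_node + on_other_nodes

print(max_mappers(4608))  # original mapreduce.map.memory.mb -> 3
print(max_mappers(1024))  # working mapreduce.map.memory.mb -> 22
```

Under these assumptions, 4608 MB maps leave room for only 3 concurrent mappers (none on the AM's node), below the 4 that the quick-start guide says Giraph assumes, while 1024 MB maps leave room for up to 22, which may be why the smaller setting succeeds even if it is not optimal.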