Hi Harsh,

Following the suggestions above, I removed the duplicated setting and
reduced the values of 'yarn.nodemanager.resource.cpu-cores',
'yarn.nodemanager.vcores-pcores-ratio' and
'yarn.nodemanager.resource.memory-mb' to 16, 8 and 12000 respectively.
After that, the efficiency improved by about 18%. I have some questions:

- How can I determine the container number? Why do you say it will be
22 containers given a 22 GB memory resource?
- My machine has 32 GB of memory; how much memory is appropriate to
assign to containers?
- In mapred-site.xml, if I set 'mapreduce.framework.name' to 'yarn',
will the other mapred-site.xml parameters, like
'mapreduce.task.io.sort.mb' and 'mapreduce.map.sort.spill.percent',
still take effect under the YARN framework?
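
For reference, after those changes the relevant yarn-site.xml entries
now read:

  <property>
    <name>yarn.nodemanager.resource.cpu-cores</name>
    <value>16</value>
  </property>

  <property>
    <name>yarn.nodemanager.vcores-pcores-ratio</name>
    <value>8</value>
  </property>

  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>12000</value>
  </property>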

Thanks!



2013/6/8 Harsh J <ha...@cloudera.com>

> Hey Sam,
>
> Did you get a chance to retry with Sandy's suggestions? The config
> appears to be asking NMs to use roughly 22 total containers (as
> opposed to 12 total tasks in MR1 config) due to a 22 GB memory
> resource. This could impact much, given the CPU is still the same for
> both test runs.
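>
> (As a rough sketch of that arithmetic: MR2 requests 1024 MB per
> container by default, so a 22000 MB NM memory resource allows
> 22000 / 1024 ≈ 21-22 concurrent containers per node.)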
>
> On Fri, Jun 7, 2013 at 12:23 PM, Sandy Ryza <sandy.r...@cloudera.com>
> wrote:
> > Hey Sam,
> >
> > Thanks for sharing your results.  I'm definitely curious about what's
> > causing the difference.
> >
> > A couple observations:
> > It looks like you've got yarn.nodemanager.resource.memory-mb in there
> > twice with two different values.
> >
> > Your max JVM memory of 1000 MB is (dangerously?) close to the default
> > mapreduce.map/reduce.memory.mb of 1024 MB. Are any of your tasks getting
> > killed for running over resource limits?
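> >
> > (As a rough sketch, one way to leave headroom is to raise the container
> > sizes above the JVM heap; the values below are illustrative only:
> >
> >   <property>
> >     <name>mapreduce.map.memory.mb</name>
> >     <value>1536</value>
> >   </property>
> >
> >   <property>
> >     <name>mapreduce.reduce.memory.mb</name>
> >     <value>1536</value>
> >   </property>
> >
> > so that the 1000 MB max JVM heap sits well inside the container
> > limit.)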
> >
> > -Sandy
> >
> >
> > On Thu, Jun 6, 2013 at 10:21 PM, sam liu <samliuhad...@gmail.com> wrote:
> >>
> >> The terasort execution log shows that the reduce phase spent about
> >> 5.5 minutes going from 33% to 35%, as shown below.
> >> 13/06/10 08:02:22 INFO mapreduce.Job:  map 100% reduce 31%
> >> 13/06/10 08:02:25 INFO mapreduce.Job:  map 100% reduce 32%
> >> 13/06/10 08:02:46 INFO mapreduce.Job:  map 100% reduce 33%
> >> 13/06/10 08:08:16 INFO mapreduce.Job:  map 100% reduce 35%
> >> 13/06/10 08:08:19 INFO mapreduce.Job:  map 100% reduce 40%
> >> 13/06/10 08:08:22 INFO mapreduce.Job:  map 100% reduce 43%
> >>
> >> Anyway, below are my configurations for your reference. Thanks!
> >> (A) core-site.xml
> >> only defines 'fs.default.name' and 'hadoop.tmp.dir'
> >>
> >> (B) hdfs-site.xml
> >>   <property>
> >>     <name>dfs.replication</name>
> >>     <value>1</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>dfs.name.dir</name>
> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/dfs_name_dir</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>dfs.data.dir</name>
> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/dfs_data_dir</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>dfs.block.size</name>
> >>     <value>134217728</value><!-- 128MB -->
> >>   </property>
> >>
> >>   <property>
> >>     <name>dfs.namenode.handler.count</name>
> >>     <value>64</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>dfs.datanode.handler.count</name>
> >>     <value>10</value>
> >>   </property>
> >>
> >> (C) mapred-site.xml
> >>   <property>
> >>     <name>mapreduce.cluster.temp.dir</name>
> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/mapreduce_temp</value>
> >>     <description>No description</description>
> >>     <final>true</final>
> >>   </property>
> >>
> >>   <property>
> >>     <name>mapreduce.cluster.local.dir</name>
> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/mapreduce_local_dir</value>
> >>     <description>No description</description>
> >>     <final>true</final>
> >>   </property>
> >>
> >> <property>
> >>   <name>mapreduce.child.java.opts</name>
> >>   <value>-Xmx1000m</value>
> >> </property>
> >>
> >> <property>
> >>     <name>mapreduce.framework.name</name>
> >>     <value>yarn</value>
> >>    </property>
> >>
> >>  <property>
> >>     <name>mapreduce.tasktracker.map.tasks.maximum</name>
> >>     <value>8</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>mapreduce.tasktracker.reduce.tasks.maximum</name>
> >>     <value>4</value>
> >>   </property>
> >>
> >>
> >>   <property>
> >>     <name>mapreduce.tasktracker.outofband.heartbeat</name>
> >>     <value>true</value>
> >>   </property>
> >>
> >> (D) yarn-site.xml
> >>  <property>
> >>     <name>yarn.resourcemanager.resource-tracker.address</name>
> >>     <value>node1:18025</value>
> >>     <description>host is the hostname of the resource manager and
> >>     port is the port on which the NodeManagers contact the Resource
> >> Manager.
> >>     </description>
> >>   </property>
> >>
> >>   <property>
> >>     <description>The address of the RM web application.</description>
> >>     <name>yarn.resourcemanager.webapp.address</name>
> >>     <value>node1:18088</value>
> >>   </property>
> >>
> >>
> >>   <property>
> >>     <name>yarn.resourcemanager.scheduler.address</name>
> >>     <value>node1:18030</value>
> >>     <description>host is the hostname of the resourcemanager and
> >>     port is the port on which the Applications in the cluster talk
> >>     to the Resource Manager.
> >>     </description>
> >>   </property>
> >>
> >>
> >>   <property>
> >>     <name>yarn.resourcemanager.address</name>
> >>     <value>node1:18040</value>
> >>     <description>the host is the hostname of the ResourceManager and the
> >> port is the port on
> >>     which the clients can talk to the Resource Manager. </description>
> >>   </property>
> >>
> >>   <property>
> >>     <name>yarn.nodemanager.local-dirs</name>
> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/yarn_nm_local_dir</value>
> >>     <description>the local directories used by the
> >> nodemanager</description>
> >>   </property>
> >>
> >>   <property>
> >>     <name>yarn.nodemanager.address</name>
> >>     <value>0.0.0.0:18050</value>
> >>     <description>the nodemanagers bind to this port</description>
> >>   </property>
> >>
> >>   <property>
> >>     <name>yarn.nodemanager.resource.memory-mb</name>
> >>     <value>10240</value>
> >>     <description>the amount of memory on the NodeManager in
> >> MB</description>
> >>   </property>
> >>
> >>   <property>
> >>     <name>yarn.nodemanager.remote-app-log-dir</name>
> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/yarn_nm_app-logs</value>
> >>     <description>directory on hdfs where the application logs are
> >>     moved to</description>
> >>   </property>
> >>
> >>    <property>
> >>     <name>yarn.nodemanager.log-dirs</name>
> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/yarn_nm_log</value>
> >>     <description>the directories used by Nodemanagers as log
> >> directories</description>
> >>   </property>
> >>
> >>   <property>
> >>     <name>yarn.nodemanager.aux-services</name>
> >>     <value>mapreduce.shuffle</value>
> >>     <description>shuffle service that needs to be set for Map Reduce to
> >> run </description>
> >>   </property>
> >>
> >>   <property>
> >>     <name>yarn.resourcemanager.client.thread-count</name>
> >>     <value>64</value>
> >>   </property>
> >>
> >>  <property>
> >>     <name>yarn.nodemanager.resource.cpu-cores</name>
> >>     <value>24</value>
> >>   </property>
> >>
> >> <property>
> >>     <name>yarn.nodemanager.vcores-pcores-ratio</name>
> >>     <value>3</value>
> >>   </property>
> >>
> >>  <property>
> >>     <name>yarn.nodemanager.resource.memory-mb</name>
> >>     <value>22000</value>
> >>   </property>
> >>
> >>  <property>
> >>     <name>yarn.nodemanager.vmem-pmem-ratio</name>
> >>     <value>2.1</value>
> >>   </property>
> >>
> >>
> >>
> >> 2013/6/7 Harsh J <ha...@cloudera.com>
> >>>
> >>> Not tuning the configurations at all is the wrong approach. YARN uses
> >>> memory-resource-based scheduling, and hence MR2 requests 1 GB per
> >>> container by default, causing it, on base configs, to max out at 8
> >>> total containers (due to the 8 GB default NM memory resource config).
> >>> Do share your configs, as at this point none of us can tell what they
> >>> are.
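> >>>
> >>> (Roughly, as a worked sketch: the default
> >>> yarn.nodemanager.resource.memory-mb of 8192 MB divided by the 1024 MB
> >>> default per-container request gives 8192 / 1024 = 8 concurrent
> >>> containers per node.)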
> >>>
> >>> Obviously, it isn't our goal to make MR2 slower for users and to not
> >>> care about such things :)
> >>>
> >>> On Fri, Jun 7, 2013 at 8:45 AM, sam liu <samliuhad...@gmail.com>
> >>> wrote:
> >>> > At the beginning, I just wanted to do a quick comparison of MRv1
> >>> > and YARN. But they have many differences, and to be fair I did not
> >>> > tune their configurations at all, so I got the above test results.
> >>> > After analyzing the test results, no doubt, I will tune the
> >>> > configurations and do the comparison again.
> >>> >
> >>> > Do you have any thoughts on the current test results? I think,
> >>> > compared with MRv1, YARN is better in the map phase (teragen test)
> >>> > but worse in the reduce phase (terasort test).
> >>> > Also, are there any detailed suggestions/comments/materials on YARN
> >>> > performance tuning?
> >>> >
> >>> > Thanks!
> >>> >
> >>> >
> >>> > 2013/6/7 Marcos Luis Ortiz Valmaseda <marcosluis2...@gmail.com>
> >>> >>
> >>> >> Why not tune the configurations?
> >>> >> Both frameworks have many areas to tune:
> >>> >> - Combiners, Shuffle optimization, Block size, etc
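> >>> >>
> >>> >> (For example, two of the spill/shuffle knobs, with purely
> >>> >> illustrative values rather than recommendations:
> >>> >>
> >>> >>   <property>
> >>> >>     <name>mapreduce.task.io.sort.mb</name>
> >>> >>     <value>200</value>
> >>> >>   </property>
> >>> >>
> >>> >>   <property>
> >>> >>     <name>mapreduce.map.sort.spill.percent</name>
> >>> >>     <value>0.90</value>
> >>> >>   </property>)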
> >>> >>
> >>> >>
> >>> >>
> >>> >> 2013/6/6 sam liu <samliuhad...@gmail.com>
> >>> >>>
> >>> >>> Hi Experts,
> >>> >>>
> >>> >>> We are thinking about whether to use YARN or not in the near
> >>> >>> future, so I ran teragen/terasort on YARN and MRv1 for comparison.
> >>> >>>
> >>> >>> My environment is a three-node cluster, and each node has similar
> >>> >>> hardware: 2 CPUs (4 cores each) and 32 GB of memory. Both the YARN
> >>> >>> and MRv1 clusters are set up on the same environment. To be fair, I
> >>> >>> did not do any performance tuning on their configurations, but used
> >>> >>> the default configuration values.
> >>> >>>
> >>> >>> Before testing, I thought YARN would be much better than MRv1 if
> >>> >>> they both used the default configuration, because YARN is a better
> >>> >>> framework than MRv1. However, the test results show some
> >>> >>> differences:
> >>> >>>
> >>> >>> MRv1: Hadoop-1.1.1
> >>> >>> Yarn: Hadoop-2.0.4
> >>> >>>
> >>> >>> (A) Teragen: generate 10 GB of data:
> >>> >>> - MRv1: 193 sec
> >>> >>> - Yarn: 69 sec
> >>> >>> Yarn is about 2.8 times faster than MRv1
> >>> >>>
> >>> >>> (B) Terasort: sort 10 GB of data:
> >>> >>> - MRv1: 451 sec
> >>> >>> - Yarn: 1136 sec
> >>> >>> Yarn is about 2.5 times slower than MRv1
> >>> >>>
> >>> >>> After a quick analysis, I think the direct cause might be that
> >>> >>> YARN is much faster than MRv1 in the map phase, but much worse in
> >>> >>> the reduce phase.
> >>> >>>
> >>> >>> Here I have two questions:
> >>> >>> - Why do my tests show Yarn is worse than MRv1 for terasort?
> >>> >>> - What's the strategy for tuning Yarn performance? Are there any
> >>> >>> materials?
> >>> >>>
> >>> >>> Thanks!
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >> --
> >>> >> Marcos Ortiz Valmaseda
> >>> >> Product Manager at PDVSA
> >>> >> http://about.me/marcosortiz
> >>> >>
> >>> >
> >>>
> >>>
> >>>
> >>> --
> >>> Harsh J
> >>
> >>
> >
>
>
>
> --
> Harsh J
>
