I was using YARN and HDFS on EC2 and EBS, with default memory settings. I have just read about a list of Hadoop benchmark jars in the Hadoop guide. I guess I'll use Whirr to create a canned Hadoop cluster on EC2 and run these benchmarks, so that I have a baseline to compare against. Then I'll compare that baseline with my own install of the Hadoop stack.
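Something along these lines, roughly (untested sketch: the cluster name, instance counts, hardware size and benchmark parameters are just placeholders, the instance template shown is the classic MR1 one from the Whirr quickstart so the role names would need adjusting for a YARN cluster, and the exact jar names differ between the Apache tarball and the CDH4 packages):

# hadoop.properties for Whirr (credentials taken from my AWS environment variables)
whirr.cluster-name=hadoop-baseline
whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,3 hadoop-datanode+hadoop-tasktracker
whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
whirr.hardware-id=m1.large

whirr launch-cluster --config hadoop.properties

# then the standard benchmarks, first on the canned cluster and then on my own stack
hadoop jar $HADOOP_HOME/hadoop-*-test*.jar TestDFSIO -write -nrFiles 10 -fileSize 1000
hadoop jar $HADOOP_HOME/hadoop-*-test*.jar TestDFSIO -read -nrFiles 10 -fileSize 1000
hadoop jar $HADOOP_HOME/hadoop-*-examples*.jar teragen 10000000 /benchmarks/terasort-input
hadoop jar $HADOOP_HOME/hadoop-*-examples*.jar terasort /benchmarks/terasort-input /benchmarks/terasort-output

I have also put a rough sketch of the yarn-site.xml / mapred-site.xml memory overrides Harsh mentioned at the very bottom of this mail.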
--
Alexandre Fouche
Lead operations engineer, cloud architect
http://www.cleverscale.com | @cleverscale
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)


On Wednesday 31 October 2012 at 21:27, Marcos Ortiz wrote:

> On 10/31/2012 02:23 PM, Michael Segel wrote:
> > Not sure.
> >
> > Lots of things can affect your throughput.
> > Networking is my first guess, which is why I asked about the number of
> > times you've run the same test, to see if there is a wide variation in
> > timings.
> >
> > On Oct 31, 2012, at 7:37 AM, Alexandre Fouche
> > <alexandre.fou...@cleverscale.com> wrote:
> > > These instances have no swap. I tried 5 or 6 times in a row, and modified
> > > yarn.nodemanager.resource.memory-mb, but it did not help. Later on,
> > > I'll replace the OpenJDK with Oracle Java SE 1.6.31 to see if it
> > > improves overall performance.
> How much RAM do you have, and how much of it is assigned to your Hadoop
> services?
> > > Now I am running everything on medium instances for prototyping, and
> > > while this is better, I still find it absurdly slow. Maybe bad Hadoop
> > > performance on anything smaller than xlarge-memory instances is to be
> > > expected on EC2?
> Are you using Hadoop on top of EC2, or are you using the EMR service?
> > >
> > > --
> > > Alexandre Fouche
> > > Lead operations engineer, cloud architect
> > > http://www.cleverscale.com | @cleverscale
> > > Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
> > >
> > > On Monday 29 October 2012 at 20:04, Michael Segel wrote:
> > > > How many times did you test it?
> > > >
> > > > Need to rule out aberrations.
> > > >
> > > > On Oct 29, 2012, at 11:30 AM, Harsh J <ha...@cloudera.com> wrote:
> > > > > On your second low-memory NM instance, did you ensure to lower the
> > > > > yarn.nodemanager.resource.memory-mb property specifically, to avoid
> > > > > swapping due to excessive resource grants? The default offered is 8 GB
> > > > > (>> the 1.7 GB you have).
> > > > >
> > > > > On Mon, Oct 29, 2012 at 8:42 PM, Alexandre Fouche
> > > > > <alexandre.fou...@cleverscale.com> wrote:
> > > > > > Hi,
> > > > > >
> > > > > > Can someone give some insight into why a "distcp" of 600 files of a
> > > > > > few hundred bytes each from s3n:// to local HDFS takes 46 s when using a
> > > > > > yarn-nodemanager EC2 instance with 16 GB memory (which, by the way, I
> > > > > > already find laughably long), and 3 min 30 s when adding a second
> > > > > > yarn-nodemanager (a small instance with 1.7 GB memory)?
> > > > > > I would have expected it to be a bit faster, not 5x longer!
> > > > > >
> > > > > > I have the same issue when I stop the small instance's nodemanager and
> > > > > > restart it so that it joins the processing after the job has already been
> > > > > > submitted to the big nodemanager instance.
> > > > > >
> > > > > > I am using Cloudera's latest YARN+HDFS on Amazon (a rebranded CentOS 6).
> > > > > >
> > > > > > #Staging 14:58:04 root@datanode2:hadoop-yarn: rpm -qa |grep hadoop
> > > > > > hadoop-hdfs-datanode-2.0.0+545-1.cdh4.1.1.p0.5.el6.x86_64
> > > > > > hadoop-mapreduce-2.0.0+545-1.cdh4.1.1.p0.5.el6.x86_64
> > > > > > hadoop-0.20-mapreduce-0.20.2+1261-1.cdh4.1.1.p0.4.el6.x86_64
> > > > > > hadoop-yarn-nodemanager-2.0.0+545-1.cdh4.1.1.p0.5.el6.x86_64
> > > > > > hadoop-mapreduce-historyserver-2.0.0+545-1.cdh4.1.1.p0.5.el6.x86_64
> > > > > > hadoop-hdfs-2.0.0+545-1.cdh4.1.1.p0.5.el6.x86_64
> > > > > > hadoop-client-2.0.0+545-1.cdh4.1.1.p0.5.el6.x86_64
> > > > > > hadoop-2.0.0+545-1.cdh4.1.1.p0.5.el6.x86_64
> > > > > > hadoop-yarn-2.0.0+545-1.cdh4.1.1.p0.5.el6.x86_64
> > > > > >
> > > > > > #Staging 14:39:51 root@resourcemanager:hadoop-yarn: HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce time hadoop distcp -overwrite s3n://xxx:xxx@s3n.hadoop.cwsdev/* hdfs:///tmp/something/a
> > > > > >
> > > > > > 12/10/29 14:40:12 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[s3n://xxx:xxx@s3n.hadoop.cwsdev/*], targetPath=hdfs:/tmp/something/a}
> > > > > > 12/10/29 14:40:18 WARN conf.Configuration: io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb
> > > > > > 12/10/29 14:40:18 WARN conf.Configuration: io.sort.factor is deprecated. Instead, use mapreduce.task.io.sort.factor
> > > > > > 12/10/29 14:40:19 INFO mapreduce.JobSubmitter: number of splits:15
> > > > > > 12/10/29 14:40:19 WARN conf.Configuration: mapred.jar is deprecated. Instead, use mapreduce.job.jar
> > > > > > 12/10/29 14:40:19 WARN conf.Configuration: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
> > > > > > 12/10/29 14:40:19 WARN conf.Configuration: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
> > > > > > 12/10/29 14:40:19 WARN conf.Configuration: mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
> > > > > > 12/10/29 14:40:19 WARN conf.Configuration: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
> > > > > > 12/10/29 14:40:19 WARN conf.Configuration: mapred.job.name is deprecated. Instead, use mapreduce.job.name
> > > > > > 12/10/29 14:40:19 WARN conf.Configuration: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
> > > > > > 12/10/29 14:40:19 WARN conf.Configuration: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
> > > > > > 12/10/29 14:40:19 WARN conf.Configuration: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
> > > > > > 12/10/29 14:40:19 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
> > > > > > 12/10/29 14:40:19 WARN conf.Configuration: mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
> > > > > > 12/10/29 14:40:19 WARN conf.Configuration: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
> > > > > > 12/10/29 14:40:20 INFO mapred.ResourceMgrDelegate: Submitted application application_1351504801306_0015 to ResourceManager at resourcemanager.cwsdev.cleverscale.com/10.60.106.130:8032
> > > > > > 12/10/29 14:40:20 INFO mapreduce.Job: The url to track the job: http://ip-10-60-106-130.ec2.internal:8088/proxy/application_1351504801306_0015/
> > > > > > 12/10/29 14:40:20 INFO tools.DistCp: DistCp job-id: job_1351504801306_0015
> > > > > > 12/10/29 14:40:20 INFO mapreduce.Job: Running job: job_1351504801306_0015
> > > > > > 12/10/29 14:40:27 INFO mapreduce.Job: Job job_1351504801306_0015 running in uber mode : false
> > > > > > 12/10/29 14:40:27 INFO mapreduce.Job: map 0% reduce 0%
> > > > > > 12/10/29 14:40:42 INFO mapreduce.Job: map 6% reduce 0%
> > > > > > 12/10/29 14:40:43 INFO mapreduce.Job: map 33% reduce 0%
> > > > > > 12/10/29 14:40:44 INFO mapreduce.Job: map 40% reduce 0%
> > > > > > 12/10/29 14:40:48 INFO mapreduce.Job: map 46% reduce 0%
> > > > > > 12/10/29 14:43:04 INFO mapreduce.Job: map 56% reduce 0%
> > > > > > 12/10/29 14:43:05 INFO mapreduce.Job: map 58% reduce 0%
> > > > > > 12/10/29 14:43:08 INFO mapreduce.Job: map 62% reduce 0%
> > > > > > 12/10/29 14:43:09 INFO mapreduce.Job: map 68% reduce 0%
> > > > > > 12/10/29 14:43:15 INFO mapreduce.Job: map 75% reduce 0%
> > > > > > 12/10/29 14:43:16 INFO mapreduce.Job: map 82% reduce 0%
> > > > > > 12/10/29 14:43:25 INFO mapreduce.Job: map 85% reduce 0%
> > > > > > 12/10/29 14:43:26 INFO mapreduce.Job: map 87% reduce 0%
> > > > > > 12/10/29 14:43:29 INFO mapreduce.Job: map 90% reduce 0%
> > > > > > 12/10/29 14:43:35 INFO mapreduce.Job: map 93% reduce 0%
> > > > > > 12/10/29 14:43:37 INFO mapreduce.Job: map 96% reduce 0%
> > > > > > 12/10/29 14:43:40 INFO mapreduce.Job: map 100% reduce 0%
> > > > > > 12/10/29 14:43:40 INFO mapreduce.Job: Job job_1351504801306_0015 completed successfully
> > > > > > 12/10/29 14:43:40 INFO mapreduce.Job: Counters: 35
> > > > > > File System Counters
> > > > > >     FILE: Number of bytes read=1800
> > > > > >     FILE: Number of bytes written=1050895
> > > > > >     FILE: Number of read operations=0
> > > > > >     FILE: Number of large read operations=0
> > > > > >     FILE: Number of write operations=0
> > > > > >     HDFS: Number of bytes read=22157
> > > > > >     HDFS: Number of bytes written=101379
> > > > > >     HDFS: Number of read operations=519
> > > > > >     HDFS: Number of large read operations=0
> > > > > >     HDFS: Number of write operations=201
> > > > > >     S3N: Number of bytes read=101379
> > > > > >     S3N: Number of bytes written=0
> > > > > >     S3N: Number of read operations=0
> > > > > >     S3N: Number of large read operations=0
> > > > > >     S3N: Number of write operations=0
> > > > > > Job Counters
> > > > > >     Launched map tasks=15
> > > > > >     Other local map tasks=15
> > > > > >     Total time spent by all maps in occupied slots (ms)=12531208
> > > > > >     Total time spent by all reduces in occupied slots (ms)=0
> > > > > > Map-Reduce Framework
> > > > > >     Map input records=57
> > > > > >     Map output records=0
> > > > > >     Input split bytes=2010
> > > > > >     Spilled Records=0
> > > > > >     Failed Shuffles=0
> > > > > >     Merged Map outputs=0
> > > > > >     GC time elapsed (ms)=42324
> > > > > >     CPU time spent (ms)=54890
> > > > > >     Physical memory (bytes) snapshot=2923872256
> > > > > >     Virtual memory (bytes) snapshot=12526301184
> > > > > >     Total committed heap usage (bytes)=1618280448
> > > > > > File Input Format Counters
> > > > > >     Bytes Read=20147
> > > > > > File Output Format Counters
> > > > > >     Bytes Written=0
> > > > > > org.apache.hadoop.tools.mapred.CopyMapper$Counter
> > > > > >     BYTESCOPIED=101379
> > > > > >     BYTESEXPECTED=101379
> > > > > >     COPY=57
> > > > > >
> > > > > > 6.90user 0.59system 3:29.17elapsed 3%CPU (0avgtext+0avgdata 819392maxresident)k
> > > > > > 0inputs+344outputs (0major+62847minor)pagefaults 0swaps
> > > > > >
> > > > > > --
> > > > > > Alexandre Fouche
> > > > >
> > > > > --
> > > > > Harsh J
>
> --
> Marcos Luis Ortiz Valmaseda
> about.me/marcosortiz
> @marcosluis2186
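PS: for anyone else running on small instances, here is the kind of override Harsh was pointing at. This is only a sketch, not necessarily what I will end up using: the values are just examples sized for the 1.7 GB small instance, and the per-task settings go in mapred-site.xml rather than yarn-site.xml.

<!-- yarn-site.xml on the 1.7 GB nodemanager: cap what YARN is allowed to hand out -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>1024</value>
</property>

<!-- mapred-site.xml: keep each map container (and its JVM heap) small enough to fit -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>512</value>
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx400m</value>
</property>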