Re: Increase the number of mappers in PM mode
In MR2, to have more mappers executed per NM, the memory request for each map should be sized so that several requests fit within the NM's configured memory allowance. For example, if the NM memory is set to 16 GB (assuming just 1 NM in the cluster) and I submit a job with mapreduce.map.memory.mb and yarn.app.mapreduce.am.resource.mb both set to 1 GB, then the NM can execute 15 maps in parallel, each consuming up to 1 GB of memory, while the remaining 1 GB goes to the AM that coordinates those executions.

On Sat, Mar 16, 2013 at 10:16 AM, yypvsxf19870706 wrote:
> Hi,
> I think I have got it. Thank you.

--
Harsh J
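The arithmetic in the reply above can be sketched as a small calculation. This is only an illustration under stated assumptions: the function name max_parallel_maps is hypothetical, memory is treated as the only limit (vcores, reducers, and any rounding the scheduler applies to container requests are ignored), and the NM allowance is assumed to be whatever is configured for the NodeManager (typically via yarn.nodemanager.resource.memory-mb, a property not named in the thread).

```python
def max_parallel_maps(nm_memory_mb, map_memory_mb, am_memory_mb, num_ams=1):
    """Estimate how many map containers fit on one NodeManager by memory alone.

    Hypothetical helper for illustration; it ignores vcores, reducers, and
    scheduler rounding of container sizes.
    """
    # Memory left on the NM after the ApplicationMaster container(s) are placed.
    remaining_mb = nm_memory_mb - num_ams * am_memory_mb
    if remaining_mb < 0:
        return 0
    # Each map container needs map_memory_mb; only whole containers count.
    return remaining_mb // map_memory_mb


if __name__ == "__main__":
    # 16 GB NM, 1 GB per map (mapreduce.map.memory.mb),
    # 1 GB for the AM (yarn.app.mapreduce.am.resource.mb).
    print(max_parallel_maps(16 * 1024, 1024, 1024))  # 15
```

Doubling the per-map request to 2 GB in the same setup would leave room for only 7 maps, which is why smaller map memory requests allow more mappers to run at once on the same node.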
Re: Increase the number of mappers in PM mode
Hi,

I think I have got it. Thank you.

Sent from my iPhone

On 2013-3-15, at 18:32, Zheyi RONG wrote:
> Indeed you cannot explicitly set the number of mappers, but still you can
> gain some control over it, by setting mapred.max.split.size, or
> mapred.min.split.size.
Re: Increase the number of mappers in PM mode
Indeed, you cannot explicitly set the number of mappers, but you can still gain some control over it by setting mapred.max.split.size or mapred.min.split.size.

For example, if you have a 10 GB file (10737418240 B) and would like 10 mappers, then each mapper has to deal with 1 GB of data. According to "splitsize = max(minimumSize, min(maximumSize, blockSize))", you can set mapred.min.split.size=1073741824 (1 GB), i.e.

$ hadoop jar yourjar -Dmapred.min.split.size=1073741824 yourargs

(The -D option has to come after the jar, and after the main class if you pass one, so that it is handled as a generic option.)

It is well explained in this thread:
http://stackoverflow.com/questions/9678180/change-file-split-size-in-hadoop

Regards,
Zheyi.

On Fri, Mar 15, 2013 at 8:49 AM, YouPeng Yang wrote:
> s
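The split-size formula quoted above can be tried out with a short calculation. This is a rough sketch, not Hadoop's actual code: the function names are hypothetical, the 128 MB block size is an assumed value (the thread does not state one), and the split count is a plain ceiling division that ignores the small slack Hadoop allows on the last split.

```python
def compute_split_size(min_size, max_size, block_size):
    """splitsize = max(minimumSize, min(maximumSize, blockSize)), as quoted in the thread."""
    return max(min_size, min(max_size, block_size))


def estimate_num_splits(file_size, split_size):
    """Approximate number of splits (and so mappers) for one file: ceiling division."""
    return (file_size + split_size - 1) // split_size


if __name__ == "__main__":
    GB = 1024 ** 3
    MB = 1024 ** 2
    file_size = 10 * GB      # 10737418240 B, the example from the thread
    block_size = 128 * MB    # assumed block size, for illustration only
    no_max = 2 ** 63 - 1     # stands in for "no maximum split size configured"

    # Raising the minimum split size to 1 GB gives fewer, larger splits.
    size = compute_split_size(1 * GB, no_max, block_size)
    print(size, estimate_num_splits(file_size, size))  # 1073741824 10

    # With no minimum and no maximum, the block size decides.
    size = compute_split_size(0, no_max, block_size)
    print(size, estimate_num_splits(file_size, size))  # 134217728 80

    # Lowering the maximum below the block size gives more, smaller splits.
    size = compute_split_size(0, 64 * MB, block_size)
    print(size, estimate_num_splits(file_size, size))  # 67108864 160
```

Under these assumptions, the minimum split size is the knob for fewer mappers and the maximum split size is the knob for more mappers; a file smaller than one split still yields a single mapper either way.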
Re: Increase the number of mappers in PM mode
Hi,

I found these interview questions by googling:

Q29. How can you set an arbitrary number of mappers to be created for a job in Hadoop?
This is a trick question. You cannot set it.

>> The above test shows that you cannot set an arbitrary number of mappers.

Q30. How can you set an arbitrary number of reducers to be created for a job in Hadoop?
You can either do it programmatically by using the setNumReduceTasks method in the JobConf class, or set it up as a configuration setting.

I tested Q30, and it seems right. My logs:

[hadoop@Hadoop01 bin]$ ./hadoop jar ../share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.0-cdh4.1.2.jar wordcount -D mapreduce.job.reduces=2 -D mapreduce.jobtracker.address=10.167.14.221:50030 /user/hadoop/yyp/input /user/hadoop/yyp/output3
===
Job Counters
    Launched map tasks=1
    Launched reduce tasks=2   -> it actually changed.
    Rack-local map tasks=1
    Total time spent by all maps in occupied slots (ms)=60356
    Total time spent by all reduces in occupied slots (ms)=135224

Regards

2013/3/14 YouPeng Yang
> Hi,
> the docs only have the property
> mapreduce.input.fileinputformat.split.minsize (default value is 0).
> Does it matter?
Re: Increase the number of mappers in PM mode
Hi,

The docs only list the property mapreduce.input.fileinputformat.split.minsize (default value is 0). Does it matter?

2013/3/14 Zheyi RONG
> Have you considered changing mapred.max.split.size?
> As in:
> http://stackoverflow.com/questions/9678180/change-file-split-size-in-hadoop
>
> Zheyi
Re: Increase the number of mappers in PM mode
Have you considered changing mapred.max.split.size?
As in:
http://stackoverflow.com/questions/9678180/change-file-split-size-in-hadoop

Zheyi

On Thu, Mar 14, 2013 at 3:27 PM, YouPeng Yang wrote:
> Job Counters
>     Launched map tasks=1
>     Launched reduce tasks=1
>
> It does not seem to work.
>
> I thought maybe my input file is too small, with just 5 records. Is that right?
Re: Increase the number of mappers in PM mode
Hi,

I have done some tests in my Pseudo Mode (CDH4.1.2) with MRv2 (YARN).

According to the docs:

  mapreduce.jobtracker.address: The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.

  mapreduce.job.maps (default value is 2): The default number of map tasks per job. Ignored when mapreduce.jobtracker.address is "local".

I changed mapreduce.jobtracker.address to Hadoop:50031, and then ran the wordcount example:

  hadoop jar hadoop-mapreduce-examples-2.0.0-cdh4.1.2.jar wordcount input output

The output logs are as follows:

  Job Counters
      Launched map tasks=1
      Launched reduce tasks=1
      Data-local map tasks=1
      Total time spent by all maps in occupied slots (ms)=60336
      Total time spent by all reduces in occupied slots (ms)=63264
  Map-Reduce Framework
      Map input records=5
      Map output records=7
      Map output bytes=56
      Map output materialized bytes=76

It does not seem to work.

I thought maybe my input file is too small, with just 5 records. Is that right?

Regards

2013/3/14 Sai Sai
> In Pseudo Mode where is the setting to increase the number of mappers or
> is this not possible.
> Thanks
> Sai
Re: Increase the number of mappers in PM mode
In Pseudo Mode, where is the setting to increase the number of mappers, or is this not possible?

Thanks,
Sai