I have followed a suggestion in the given link and set mapred.min.split.size
to 134217728 (128 MB).
With that setting, I now get mapred.map.tasks = 121
(previously it was 242).
Thanks for all the replies !
Shing
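For anyone landing on this thread later, the arithmetic behind that drop is worth spelling out. In the 1.x FileInputFormat, the split size works out to max(mapred.min.split.size, min(mapred.max.split.size, dfs.block.size)) — a sketch of the formula, not the exact Hadoop source — so a 128 MB minimum pushes each 64 MB split up to 128 MB and halves the map count. In plain Java, with no Hadoop dependency:

```java
public class SplitMath {
    // Hadoop 1.x FileInputFormat (sketch):
    // splitSize = max(minSize, min(maxSize, blockSize))
    static long computeSplitSize(long minSize, long maxSize, long blockSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long blockSize = 67108864L;          // 64 MB (dfs.block.size in this thread)
        long fileSize = 242L * blockSize;    // ~15.1 GB, i.e. "about 16 GB"

        // Defaults: effectively minSize = 1, maxSize = Long.MAX_VALUE
        long defaultSplit = computeSplitSize(1L, Long.MAX_VALUE, blockSize);
        System.out.println(fileSize / defaultSplit);   // 242 map tasks

        // mapred.min.split.size = 134217728 (128 MB)
        long biggerSplit = computeSplitSize(134217728L, Long.MAX_VALUE, blockSize);
        System.out.println(fileSize / biggerSplit);    // 121 map tasks
    }
}
```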
From: Romedius Weiss
> I am running Hadoop 1.0.3 in pseudo-distributed mode.
> When I submit a map/reduce job to process a file of about 16 GB, job.xml
> shows the following:
> mapred.map.tasks = 242
> mapred.min.split.size = 0
> dfs.block.size = 67108864
> I would like to reduce mapred.map.tasks to see if it
Those numbers make sense, given one map task per block: a ~16 GB file at a
64 MB block size is roughly 16 GB / 64 MB ≈ 256 blocks, consistent with the
242 map tasks you see.
When you doubled dfs.block.size, how did you accomplish that? Typically,
the block size is selected at file write time, with a default value from
the system configuration used if none is specified.
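A sketch of what that means in 1.x terms (hedged, from memory of the client-side behavior): dfs.block.size is read by the *writing* client, so changing it in the client's configuration only affects files written afterwards; an existing file keeps its original 64 MB blocks until it is rewritten (e.g. with hadoop fs -cp, as Shing does below). The config fragment would look like:

```xml
<!-- Client-side setting, read at file write time; existing files
     keep their old block size until rewritten. Not required just
     to lower the map count. -->
<property>
  <name>dfs.block.size</name>
  <value>134217728</value> <!-- 128 MB -->
</property>
```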
Hi
You need to alter the value of mapred.max.split size to a value larger than
your block size to get fewer map tasks than the default.
On Tue, Oct 2, 2012 at 10:04 PM, Shing Hing Man mat...@yahoo.com wrote:
> I am running Hadoop 1.0.3 in Pseudo distributed mode.
> When I submit a
Sorry for the typo, the property name is mapred.max.split.size
Also just for changing the number of map tasks you don't need to modify the
hdfs block size.
On Tue, Oct 2, 2012 at 10:31 PM, Bejoy Ks bejoy.had...@gmail.com wrote:
> Hi
> You need to alter the value of mapred.max.split size to a
I have tried
conf.setInt("mapred.max.split.size", 134217728);
and setting mapred.max.split.size in mapred-site.xml (dfs.block.size is left
unchanged at 67108864).
But in job.xml, I am still getting mapred.map.tasks = 242.
Shing
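A hedged explanation of why that had no effect: if the 1.x split size is computed as max(mapred.min.split.size, min(mapred.max.split.size, dfs.block.size)) — a sketch, not the exact Hadoop source — then mapred.max.split.size only *caps* the split. With a 64 MB block, a 128 MB cap leaves the split at 64 MB and the map count unchanged; it is the *minimum* that can push the split above the block size. In plain Java:

```java
public class MaxSplitDemo {
    // Hadoop 1.x FileInputFormat (sketch):
    // splitSize = max(minSize, min(maxSize, blockSize))
    static long splitSize(long minSize, long maxSize, long blockSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long block = 67108864L;   // 64 MB
        long m128  = 134217728L;  // 128 MB

        // mapred.max.split.size = 128 MB only caps the split:
        // it stays at the 64 MB block, so still 242 maps.
        System.out.println(splitSize(1L, m128, block));              // 67108864

        // mapred.min.split.size = 128 MB raises the split past the block.
        System.out.println(splitSize(m128, Long.MAX_VALUE, block));  // 134217728
    }
}
```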
Hi Shing
Is your input a single file or a set of small files? If the latter, you need
to use CombineFileInputFormat.
Regards
Bejoy KS
Sent from handheld, please excuse typos.
-----Original Message-----
From: Shing Hing Man mat...@yahoo.com
Date: Tue, 2 Oct 2012 10:38:59
To:
I have done the following.
1) stop-all.sh
2) In mapred-site.xml, added
<property>
  <name>mapred.max.split.size</name>
  <value>134217728</value>
</property>
(dfs.block.size remains unchanged at 67108864)
3) start-all.sh
4) Used hadoop fs -cp src destn to copy my original file to another HDFS location.
I only have one big input file.
Shing
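For the archive: the setting that finally worked (per Shing's follow-up at the top of this thread) was mapred.min.split.size, not mapred.max.split.size — raising the *minimum* above the 64 MB block size is what enlarges the splits. In mapred-site.xml form, with the same value Shing used:

```xml
<property>
  <name>mapred.min.split.size</name>
  <value>134217728</value> <!-- 128 MB: splits grow past the 64 MB block -->
</property>
```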
From: Bejoy KS bejoy.had...@gmail.com
To: user@hadoop.apache.org; Shing Hing Man mat...@yahoo.com
Sent: Tuesday, October 2, 2012 6:46 PM
Subject: Re: How to lower the total number of map tasks
> Hi Shing
> Is your input a