I will go through this..
> >>
> >> Sent from my iPhone
> >> On May 25, 2011, at 7:51 AM, Juwei Shi wrote:
> >>
> >> The following are suitable for hadoop 0.20.2.
> >>
> >> 2011/5/25 Juwei Shi
> >>>
> >>> Th
>
> goalSize = totalSize / mapred.map.tasks
> minSize = max {mapred.min.split.size, minSplitSize}
> splitSize= max (minSize, min(goalSize, dfs.block.size))
>
> minSplitSize is determined by each InputFormat such as
> SequenceFileInputFormat.
>
> You may want
The following are suitable for hadoop 0.20.2.
2011/5/25 Juwei Shi
> The input split size is determined by mapred.min.split.size, dfs.block.size and
> mapred.map.tasks.
>
> goalSize = totalSize / mapred.map.tasks
> minSize = max {mapred.min.split.size, minSplitSize}
> splitSize=
The input split size is determined by mapred.min.split.size, dfs.block.size and
mapred.map.tasks.
goalSize = totalSize / mapred.map.tasks
minSize = max(mapred.min.split.size, minSplitSize)
splitSize = max(minSize, min(goalSize, dfs.block.size))
minSplitSize is determined by each InputFormat, such as SequenceFileInputFormat.
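The formula above can be sketched in Python as follows. This is only a minimal sketch, not the Hadoop source: the function name compute_split_size and its parameter names are hypothetical, all sizes are assumed to be in bytes, and the guard against zero map tasks is an assumption added for safety.

```python
MB = 1024 * 1024
GB = 1024 * MB

def compute_split_size(total_size, map_tasks, conf_min_split, format_min_split, block_size):
    """Hypothetical helper mirroring the three-line formula from the message."""
    # goalSize = totalSize / mapred.map.tasks (guard against 0 is an assumption)
    goal_size = total_size // max(map_tasks, 1)
    # minSize = max(mapred.min.split.size, minSplitSize of the InputFormat)
    min_size = max(conf_min_split, format_min_split)
    # splitSize = max(minSize, min(goalSize, dfs.block.size))
    return max(min_size, min(goal_size, block_size))

# Assumed inputs: 10 GB of data, 2 requested map tasks, 64 MB blocks.
default_split = compute_split_size(10 * GB, 2, 1, 1, 64 * MB)
raised_split = compute_split_size(10 * GB, 2, 1 * GB, 1, 64 * MB)
print(default_split // MB, raised_split // MB)  # prints "64 1024"
```

With these assumed inputs, the split stays at one block (64 MB) until the configured minimum is raised above the block size, at which point the minimum wins.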
Hi,
I have a few input splits that are each a few MB in size.
I want to submit 1 GB of input to every mapper. How can I do it?
Currently each mapper gets one input split, which results in many small
map-output files.
I tried setting -Dmapred.map.min.split.size= , but it still does not
take effect.
Thanks,
As I understand it, mapred.min.split.size defines the minimum size of a
split. In the case below:
(1) HDFS block size = 32MB, mapred.min.split.size=64MB
(mapred.min.split.size can only be set to a value larger than the HDFS block size)
when I run MapReduce, it means that a map will process one input split of
64MB
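The case above can be checked with a small sketch. It assumes the Hadoop 0.20.2 rule splitSize = max(minSize, min(goalSize, dfs.block.size)); the function name split_size and the sample goalSize values are hypothetical and chosen only for illustration.

```python
MB = 1024 * 1024

def split_size(min_size, goal_size, block_size):
    # Assumed rule: splitSize = max(minSize, min(goalSize, blockSize))
    return max(min_size, min(goal_size, block_size))

block = 32 * MB      # HDFS block size from case (1)
min_split = 64 * MB  # mapred.min.split.size from case (1)

# min(goalSize, block) can never exceed 32 MB, so the 64 MB minimum always wins,
# whatever goalSize turns out to be (the three values here are arbitrary).
sizes = {split_size(min_split, goal, block) for goal in (1 * MB, 32 * MB, 500 * MB)}
blocks_per_split = min_split // block
print(sizes == {64 * MB}, blocks_per_split)
```

Under that assumption each map would read a 64MB split covering two 32MB blocks, which is consistent with the reading given in case (1).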
On 3/18/2011 3:54 PM, Pedro Costa wrote:
Hi
What's the purpose of the parameter "mapred.min.split.size"?
Thanks,
There are many parameters that control the number of map tasks for a
job, and mapred.min.split.size controls the minimum size of a split.
Other
Hi
What's the purpose of the parameter "mapred.min.split.size"?
Thanks,
--
Pedro