Hans,

You can change memory requirements for tasks of a single job, but not
of a single task inside that job.

Briefly, this is how the 0.20 framework works by default: a
TaskTracker (TT) only has a notion of "slots" and carries a maximum
_number_ of slots it may run simultaneously. It does not know what
each task occupying a slot will demand in terms of resources. Your job
then supplies the number of map tasks and, as job-wide configuration,
the amount of memory required per map task. The TTs merely start the
task JVMs with the heap configuration provided.
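
To make the arithmetic concrete, here is a rough sketch in plain Java
(no Hadoop dependency). The class and method names are made up for
illustration, and the 3 GB reserve is only the upper end of the 2-3 GB
headroom I suggested earlier, so adjust it for your nodes. The
property names it prints are the real 0.20-era ones.

```java
// Illustrative sketch only: SlotMath and its methods are hypothetical helpers,
// not part of Hadoop. They reproduce the slot arithmetic by hand.
final class SlotMath {

    // Largest slot count such that slots * heapPerTaskGb <= totalRamGb - reservedGb.
    static int maxMapSlots(int totalRamGb, int reservedGb, int heapPerTaskGb) {
        if (heapPerTaskGb <= 0) {
            throw new IllegalArgumentException("heapPerTaskGb must be positive");
        }
        int usableGb = totalRamGb - reservedGb;
        return Math.max(0, usableGb / heapPerTaskGb);
    }

    // JVM heap flag for the job-wide map task setting; the unit suffix is "m", not "mb".
    static String childJavaOpts(int heapMb) {
        return "-Xmx" + heapMb + "m";
    }

    public static void main(String[] args) {
        // Assumed figures from this thread: 32 GB nodes, 4 GB heap per map task.
        int slots = maxMapSlots(32, 3, 4);
        // TaskTracker-side setting: the slot count, configured per node.
        System.out.println("mapred.tasktracker.map.tasks.maximum=" + slots);
        // Job-side setting: applies to every map task of that job.
        System.out.println("mapred.map.child.java.opts=" + childJavaOpts(4096));
    }
}
```

With those assumed figures the sketch arrives at 7 map slots, before
leaving any room for reducers running in parallel.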

On Sun, Mar 11, 2012 at 11:24 AM, Hans Uhlig <huh...@uhlisys.com> wrote:
> That was a typo in my email not in the configuration. Is the memory reserved
> for the tasks when the task tracker starts? You seem to be suggesting that I
> need to set the memory to be the same for all map tasks. Is there no way to
> override for a single map task?
>
>
> On Sat, Mar 10, 2012 at 8:41 PM, Harsh J <ha...@cloudera.com> wrote:
>>
>> Hans,
>>
>> It's possible you have a typo: mapred.map.child.jvm.opts -
>> no such property exists. Perhaps you wanted
>> "mapred.map.child.java.opts"?
>>
>> Additionally, the computation you need to do is: (# of map slots on a
>> TT * per-map-task heap requirement) should be less than (Total RAM -
>> 2 to 3 GB). With your 4 GB requirement, I guess you can support a max
>> of 6-7 slots per machine (i.e. not counting reducer heap requirements
>> in parallel).
>>
>> On Sun, Mar 11, 2012 at 9:30 AM, Hans Uhlig <huh...@uhlisys.com> wrote:
>> > I am attempting to speed up a mapping process whose input is GZIP
>> > compressed CSV files. The files range from 1-2GB, I am running on a
>> > Cluster where each node has a total of 32GB memory available to use.
>> > I have attempted to tweak mapred.map.child.jvm.opts with -Xmx4096mb
>> > and io.sort.mb to 2048 to accommodate the size but I keep getting
>> > java heap errors or other memory related problems. My row count per
>> > mapper is well below Integer.MAX_INTEGER limit by several orders of
>> > magnitude and the box is NOT using anywhere close to its full memory
>> > allotment. How can I specify that this map task can have 3-4 GB of
>> > memory for the collection, partition and sort process without
>> > constantly spilling records to disk?
>>
>>
>>
>> --
>> Harsh J
>
>



-- 
Harsh J
