>
> hope it helps
> Olivier
>
> On 6 Dec 2012, at 00:24, Hans Uhlig wrote:
>
I am currently using MultipleInputs to merge quite a few different but
related file types, and I am attempting to track down some bad data. However,
MultipleInputs shields the FileSplit behind a TaggedInputSplit, and
map.input.file is no longer set. How can one get the file path now for
tracking down the bad data?
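One commonly cited workaround is to unwrap the split via reflection in the mapper's setup(), since TaggedInputSplit is package-private and cannot be referenced or cast to directly. The sketch below demonstrates the reflection pattern with a stand-in wrapper class so it runs outside Hadoop; the method name getInputSplit and the behavior of the real TaggedInputSplit are assumptions based on its source, and TaggedSplitStandIn is purely hypothetical.

```java
import java.lang.reflect.Method;

public class UnwrapSplitDemo {
    // Stand-in for Hadoop's package-private TaggedInputSplit, which
    // wraps the real FileSplit behind an inaccessible getter.
    static class TaggedSplitStandIn {
        private final String wrappedPath;
        TaggedSplitStandIn(String path) { this.wrappedPath = path; }
        // In real Hadoop this getter returns the wrapped InputSplit.
        private String getInputSplit() { return wrappedPath; }
    }

    // Reflectively call the inaccessible getter -- the same trick one
    // would apply in Mapper.setup() to context.getInputSplit().
    static String unwrap(Object split) throws Exception {
        Method m = split.getClass().getDeclaredMethod("getInputSplit");
        m.setAccessible(true);
        return (String) m.invoke(split);
    }

    public static void main(String[] args) throws Exception {
        Object tagged = new TaggedSplitStandIn("/data/input/part-00000.csv");
        System.out.println(unwrap(tagged));
    }
}
```

In a real mapper the unwrapped object would be cast to FileSplit and its getPath() used in place of map.input.file.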
> A job supplies a number of map tasks, and the amount of memory required
> per map task, as configuration. TTs then merely start the task JVMs
> with the provided heap configuration.
>
> On Sun, Mar 11, 2012 at 11:24 AM, Hans Uhlig wrote:
> > That was a typo in my email no
> (number-of-map-slots * per-map-task-heap-requirement) should stay below
> (Total RAM - 2/3 GB). With your 4 GB requirement, I guess you can support
> a max of 6-7 slots per machine (i.e. not counting reducer heap
> requirements in parallel).
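As a sanity check on the arithmetic above (assuming 32 GB per node, roughly 2-3 GB reserved for the OS and the DataNode/TaskTracker daemons, and a 4 GB heap per map task; the helper below is illustrative, not a Hadoop API):

```java
public class SlotEstimate {
    // Rough upper bound on map slots:
    // (total RAM - reserved for daemons/OS) / heap per map task.
    static int maxSlots(int totalRamGb, int reservedGb, int heapPerTaskGb) {
        return (totalRamGb - reservedGb) / heapPerTaskGb;
    }

    public static void main(String[] args) {
        System.out.println(maxSlots(32, 2, 4)); // 30 / 4 -> 7
        System.out.println(maxSlots(32, 3, 4)); // 29 / 4 -> 7
    }
}
```

With 2-3 GB reserved the integer division lands on 7 slots, which matches the "max of 6-7 slots" estimate once reducer heap is also accounted for.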
>
> On Sun, Mar 11, 2012 at 9:30 AM, Hans Uhlig wrote:
> >
I am attempting to specify this for a single job during its
creation/submission, not via the general cluster-wide configuration. I am
using the new API, so I am adding the values to the conf passed into new Job();
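A sketch of that per-job approach. This is not runnable without the Hadoop jars, and the exact property key varies by Hadoop version (older releases use mapred.map.child.java.opts, newer ones mapreduce.map.java.opts), so treat the key names as assumptions to verify against your distribution:

```java
// Set the map-task heap for this one job only, via the conf
// passed into new Job(); cluster defaults are untouched.
Configuration conf = new Configuration();
conf.set("mapred.map.child.java.opts", "-Xmx4096m");  // older property name
// conf.set("mapreduce.map.java.opts", "-Xmx4096m");  // newer property name
Job job = new Job(conf, "merge-job");
```

Note the JVM flag syntax is -Xmx4096m (no "mb" suffix); an invalid suffix makes the flag silently fail or the JVM refuse to start, depending on version.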
2012/3/10 WangRamon
> How many map/reduce task slots do you have for each node? If the
> total numb
I am attempting to speed up a mapping process whose input is GZIP-compressed
CSV files. The files range from 1-2 GB, and I am running on a cluster where
each node has a total of 32 GB of memory available to use. I have attempted to
tweak mapred.map.child.jvm.opts with -Xmx4096mb and io.sort.mb to 2048 to ac