I assume you know the tradeoff here: if you depend on the mapper slot number in
your implementation to speed it up, you lose code portability in the
long term.
That said, one way to achieve this is to use the JobConf API:
int partition = jobConf.getInt(JobContext.TASK_PARTITION, -1);
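Fleshed out as a self-contained sketch of that lookup pattern: JobContext.TASK_PARTITION is the string constant "mapreduce.task.partition" (Hadoop 2 APIs), which the framework sets in the job configuration before each task runs. Here java.util.Properties stands in for Hadoop's Configuration so the snippet runs without a cluster; the class name and the value 7 are illustrative.

```java
import java.util.Properties;

public class TaskPartitionDemo {
    // Mirrors Configuration.getInt(key, defaultValue): return the parsed
    // value if present, otherwise fall back to the default.
    static int getInt(Properties conf, String key, int defaultValue) {
        String v = conf.getProperty(key);
        if (v == null) return defaultValue;
        try {
            return Integer.parseInt(v.trim());
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // On a real cluster the framework sets "mapreduce.task.partition"
        // (the value of JobContext.TASK_PARTITION) before the task starts.
        conf.setProperty("mapreduce.task.partition", "7");
        System.out.println(getInt(conf, "mapreduce.task.partition", -1)); // 7
        // Missing key falls back to the default, e.g. in a local run:
        System.out.println(getInt(new Properties(), "mapreduce.task.partition", -1)); // -1
    }
}
```

The -1 default makes it easy to detect code that accidentally runs outside a task context.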
> a, then a different day I want to run against 30 days?
> On Thu, Sep 19, 2013 at 3:11 PM, Rahul Jain wrote:
>
I am assuming you have looked at this already:
https://issues.apache.org/jira/browse/MAPREDUCE-5186
You do have a workaround here: increase the mapreduce.job.max.split.locations
value in the Hive configuration. Or do we need more than that here?
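For reference, that property can be set per-session from the Hive CLI or cluster-wide in mapred-site.xml; the value 100 below is an illustrative guess, not a recommendation — size it to the number of block locations your splits actually have.

```
-- per-session, in the Hive CLI (value is illustrative):
set mapreduce.job.max.split.locations=100;
```

```xml
<!-- cluster-wide, in mapred-site.xml (value is illustrative): -->
<property>
  <name>mapreduce.job.max.split.locations</name>
  <value>100</value>
</property>
```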
-Rahul
On Thu, Sep 19, 2013 at 11:00 AM, Murtaza Do
Which version of Hadoop are you using, MRv1 or MRv2 (YARN)?
For MRv2 (YARN) you can pretty much achieve this using:
yarn.nodemanager.resource.memory-mb (system-wide setting)
and
mapreduce.map.memory.mb (job-level setting)
e.g. if yarn.nodemanager.resource.memory-mb=100
and mapreduce.map.mem
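The example above is cut off, but the arithmetic it presumably illustrates is simple: the number of map containers YARN runs concurrently on a node is bounded by the node manager's memory divided by the per-task container size. A sketch with assumed values (8192 MB per node, 2048 MB per map task, both illustrative), ignoring the rounding YARN applies via the scheduler's minimum allocation:

```java
public class ContainerMath {
    // Rough MRv2 capacity rule: concurrent map containers per node is
    // bounded by yarn.nodemanager.resource.memory-mb divided by
    // mapreduce.map.memory.mb (ignoring scheduler allocation rounding).
    static int maxConcurrentMaps(int nodeManagerMb, int mapTaskMb) {
        return nodeManagerMb / mapTaskMb;
    }

    public static void main(String[] args) {
        int nodeMb = 8192; // assumed yarn.nodemanager.resource.memory-mb
        int mapMb = 2048;  // assumed mapreduce.map.memory.mb
        System.out.println(maxConcurrentMaps(nodeMb, mapMb)); // 4
    }
}
```

So raising mapreduce.map.memory.mb trades per-task headroom against per-node parallelism.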
Check your node manager logs first to understand the bottleneck. When we
had a similar issue on a recent version of Hadoop (one that includes the fix
for MAPREDUCE-4068), we rearranged our job jar file to reduce the time the
node manager(s) spent 'expanding' it.
-Rahul
On Sun, Jan 20, 2013