Re: Identification of mapper slots

2013-10-14 Thread Rahul Jain
I assume you know the tradeoff here: if you depend on the mapper slot number in your implementation to speed it up, you lose code portability in the long term. That said, one way to achieve this is to use the JobConf API: int partition = jobConf.getInt(JobContext.TASK_PARTITION, -1); The fra
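A minimal sketch of the lookup-with-default pattern in the call quoted above. It is not Hadoop code: java.util.Properties stands in for the JobConf so the example runs without Hadoop on the classpath, and the property name "mapreduce.task.partition" is an assumption about what JobContext.TASK_PARTITION resolves to. The class and helper names are hypothetical.

```java
import java.util.Properties;

public class TaskPartitionSketch {
    // Assumption: the key behind JobContext.TASK_PARTITION; check your Hadoop version.
    static final String TASK_PARTITION = "mapreduce.task.partition";

    // Same contract as the getInt call in the post: return the configured
    // integer, or the supplied default when the key is not set.
    static int getInt(Properties conf, String name, int defaultValue) {
        String value = conf.getProperty(name);
        if (value == null) {
            return defaultValue;
        }
        return Integer.parseInt(value.trim());
    }

    // Returns the task's partition index, or -1 when the framework has not set it.
    static int taskPartition(Properties conf) {
        return getInt(conf, TASK_PARTITION, -1);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(taskPartition(conf)); // key absent, falls back to the default
        conf.setProperty(TASK_PARTITION, "3");
        System.out.println(taskPartition(conf)); // key present, parsed as an int
    }
}
```

The -1 default gives the caller a way to detect that no partition was set, for example when the code runs outside a task.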

Re: Issue: Max block location exceeded for split error when running hive

2013-09-19 Thread Rahul Jain
a, then a different day I want to run
> against 30 days?
>
> On Thu, Sep 19, 2013 at 3:11 PM, Rahul Jain wrote:
>
>> I am assuming you have looked at this already:
>>
>> https://issues.apache.org/jira/browse/MAPREDUCE-5186
>>
>> You do have a wo

Re: Issue: Max block location exceeded for split error when running hive

2013-09-19 Thread Rahul Jain
I am assuming you have looked at this already: https://issues.apache.org/jira/browse/MAPREDUCE-5186 You do have a workaround here: increase the *mapreduce.job.max.split.locations* value in the Hive configuration. Or do we need more than that here? -Rahul On Thu, Sep 19, 2013 at 11:00 AM, Murtaza Do

Re: What happens when you have fewer input files than mapper slots?

2013-03-19 Thread Rahul Jain
Which version of Hadoop are you using: MRv1 or MRv2 (YARN)? For MRv2 (YARN), you can pretty much achieve this using yarn.nodemanager.resource.memory-mb (a system-wide setting) and mapreduce.map.memory.mb (a job-level setting), e.g. if yarn.nodemanager.resource.memory-mb=100 and mapreduce.map.mem
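The arithmetic behind those two settings can be sketched as follows. This is a rough estimate, not the scheduler's actual logic: it assumes memory is the only resource considered, and it ignores the container taken by the application master and any rounding to the scheduler's allocation increments. The numbers are hypothetical and are not the ones from the truncated example above.

```java
public class MapContainerEstimate {
    // Upper bound on map containers that fit on one node at a time:
    // floor(yarn.nodemanager.resource.memory-mb / mapreduce.map.memory.mb).
    static int maxConcurrentMapsPerNode(int nodeMemoryMb, int mapMemoryMb) {
        if (nodeMemoryMb < 0 || mapMemoryMb <= 0) {
            throw new IllegalArgumentException("memory sizes must be positive");
        }
        return nodeMemoryMb / mapMemoryMb; // integer division rounds down
    }

    public static void main(String[] args) {
        // Hypothetical node offering 8192 MB to containers.
        System.out.println(maxConcurrentMapsPerNode(8192, 1024)); // prints 8
        System.out.println(maxConcurrentMapsPerNode(8192, 3072)); // prints 2
    }
}
```

Raising mapreduce.map.memory.mb for a job lowers the number of its map tasks that can run at once on each node, which is the lever the reply describes.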

Re: Time taken for launching Application Master

2013-01-20 Thread Rahul Jain
Check your node manager logs to understand the bottleneck first. When we had a similar issue on a recent version of Hadoop that includes the fix for MAPREDUCE-4068, we rearranged our job jar file to reduce the time the node manager(s) spent 'expanding' it. -Rahul On Sun, Jan 20, 2013