Re: Is it possible to get the number of mapper tasks?

2010-12-03 Thread Jason
Great! mapred.map.tasks and mapred.task.partition work perfectly for me, even for the local job runner. Thanks

On Dec 3, 2010, at 5:59 PM, Harsh J wrote:
> Hi,
>
> (Answers may be 0.20 specific)
>
> On Sat, Dec 4, 2010 at 6:41 AM, Jason wrote:
>> In my mapper code I need to know the total
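For anyone finding this thread later, here is a minimal sketch of how those two properties could be combined for int ID generation. It assumes mapred.map.tasks holds the total number of map tasks and mapred.task.partition holds this task's zero-based index, both read once in the mapper's setup (for example with conf.getInt(...)). The class name StridedIdGenerator and the striding scheme are illustrative assumptions, not something posted in the thread.

```java
/**
 * Sketch only: hands out IDs p, p + n, p + 2n, ... for task p of n.
 * numTasks is assumed to come from "mapred.map.tasks" and
 * partition from "mapred.task.partition".
 */
final class StridedIdGenerator {
    private final int numTasks;
    private final int partition;
    private int emitted = 0; // how many IDs this task has handed out so far

    StridedIdGenerator(int numTasks, int partition) {
        if (numTasks <= 0 || partition < 0 || partition >= numTasks) {
            throw new IllegalArgumentException(
                "need 0 <= partition < numTasks, got " + partition + " and " + numTasks);
        }
        this.numTasks = numTasks;
        this.partition = partition;
    }

    /** Returns the next ID; throws ArithmeticException instead of wrapping on int overflow. */
    int next() {
        int base = Math.multiplyExact(emitted, numTasks);
        int id = Math.addExact(base, partition);
        emitted++;
        return id;
    }
}
```

Every ID from task p is congruent to p modulo n, so two tasks can never produce the same value, and gaps only appear when tasks emit different numbers of records.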

Re: Is it possible to get the number of mapper tasks?

2010-12-03 Thread Jason
> BTW, why not take task attempt id context.getTaskAttemptID() as the
> prefix of unique id ? The task attempt id for each task should be
> different

The reason is that I would prefer to not have big gaps in my int id sequence, so I'd rather store mapper task ID in the low bits (suffix instead of
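One way to read the "low bits" idea, sketched under assumptions: reserve just enough low bits to hold the largest task index and keep a per-task counter in the bits above them. The names LowBitIds, bitsFor and pack are made up for illustration, and the task index and task count are assumed to come from the job configuration.

```java
/** Sketch only: packs a per-task counter and a task index into one int. */
final class LowBitIds {
    private LowBitIds() {}

    /** Number of low bits needed to hold task indexes 0 .. numTasks - 1. */
    static int bitsFor(int numTasks) {
        if (numTasks <= 0) {
            throw new IllegalArgumentException("numTasks must be positive");
        }
        // numberOfLeadingZeros(0) is 32, so a single task needs 0 bits.
        return 32 - Integer.numberOfLeadingZeros(numTasks - 1);
    }

    /** Puts taskIndex in the low bits (suffix) and sequence in the bits above. */
    static int pack(int sequence, int taskIndex, int numTasks) {
        if (taskIndex < 0 || taskIndex >= numTasks) {
            throw new IllegalArgumentException("taskIndex out of range");
        }
        int bits = bitsFor(numTasks);
        if (sequence < 0 || sequence > (Integer.MAX_VALUE >>> bits)) {
            throw new ArithmeticException("sequence does not fit next to the task index");
        }
        return (sequence << bits) | taskIndex;
    }
}
```

With a task count that is a power of two this fills the int range densely; otherwise the unused task indexes leave small, regular gaps, which is still much tighter than using the id as a prefix.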

Re: Is it possible to get the number of mapper tasks?

2010-12-03 Thread Harsh J
Hi,

On Sat, Dec 4, 2010 at 7:36 AM, Shrijeet Paliwal wrote:
> A note on mapred.map.tasks.
> Hadoop does not honor mapred.map.tasks all the time. It is just a hint
> for the framework, actual number of map tasks launched may be
> different. *I think*.
>

This is true pre-job-submission. The InputF

Re: Query

2010-12-03 Thread Harsh J
Hi,

On Sat, Dec 4, 2010 at 6:03 AM, Jon Lederman wrote:
>
> 2) Will Hadoop leverage multiple cores automatically or do I need to set some
> options in the configuration to do this?  How does Hadoop distribute
> map/reduce jobs in a multi-core setting?

You need to configure a couple of options

Re: Is it possible to get the number of mapper tasks?

2010-12-03 Thread Shrijeet Paliwal
>>mapred.map.tasks is set for every job before launch and is the total
>>number of maps that are going to run for a successful result.

A note on mapred.map.tasks.
Hadoop does not honor mapred.map.tasks all the time. It is just a hint for the framework, actual number of map tasks launched may be di

Re: Is it possible to get the number of mapper tasks?

2010-12-03 Thread Harsh J
Minor correction, it is:

mapred.tip.id is the task's id (contains various info about the task, map/reduce).
mapred.task.id is the task's _attempt_ id (basically tip id, with attempt information, map/reduce).

On Sat, Dec 4, 2010 at 7:29 AM, Harsh J wrote:
> Hi,
>
> (Answers may be 0.20 specific)
>
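To make the difference between the two ids concrete, here is a small sketch that derives the task id from an attempt id string. It assumes the usual attempt_<jobtracker start>_<job>_<m|r>_<task>_<attempt> layout; the sample id in the usage note and the helper names are illustrative, and in real code Hadoop's own TaskAttemptID class is probably the safer way to parse these.

```java
/** Sketch only: string-level view of a task attempt id versus its task (tip) id. */
final class TaskIdStrings {
    private TaskIdStrings() {}

    private static String[] parts(String attemptId) {
        String[] p = attemptId.split("_");
        if (p.length != 6 || !p[0].equals("attempt")) {
            throw new IllegalArgumentException("not an attempt id: " + attemptId);
        }
        return p;
    }

    /** Drops the trailing attempt number and swaps the prefix to get the task id. */
    static String taskIdOf(String attemptId) {
        String[] p = parts(attemptId);
        return "task_" + p[1] + "_" + p[2] + "_" + p[3] + "_" + p[4];
    }

    /** Zero-based task number within the job (the fifth component). */
    static int taskNumberOf(String attemptId) {
        return Integer.parseInt(parts(attemptId)[4]);
    }

    /** Attempt number of this particular run of the task (the last component). */
    static int attemptNumberOf(String attemptId) {
        return Integer.parseInt(parts(attemptId)[5]);
    }
}
```

For a made-up id such as attempt_201012031234_0001_m_000003_1, taskIdOf gives task_201012031234_0001_m_000003, taskNumberOf gives 3 and attemptNumberOf gives 1. A re-run of the same task keeps the task id and only changes the last component.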

Re: Is it possible to get the number of mapper tasks?

2010-12-03 Thread Harsh J
Hi,

(Answers may be 0.20 specific)

On Sat, Dec 4, 2010 at 6:41 AM, Jason wrote:
> In my mapper code I need to know the total number of mappers which is the
> same as number of input splits.
> (I need it for unique int Id generation)

mapred.map.tasks is set for every job before launch and is th

Re: Is it possible to get the number of mapper tasks?

2010-12-03 Thread Jeff Zhang
You can use the following code to get the number of mapper tasks:

    InputFormat inputFormat = ReflectionUtils.newInstance(
            context.getInputFormatClass(), context.getConfiguration());
    int num

Is it possible to get the number of mapper tasks?

2010-12-03 Thread Jason
In my mapper code I need to know the total number of mappers, which is the same as the number of input splits. (I need it for unique int ID generation.) Basically I'm looking for an analog of context.getNumReduceTasks() but can't find it. Thanks

Query

2010-12-03 Thread Jon Lederman
I am trying to run Hadoop on a multi-core architecture (e.g., 32 or more cores). I am somewhat unclear what the best configuration would be to maximize performance of Hadoop in this setting.

1) Since map/reduce jobs are i/o bound and all cores share the same i/o I am wondering whether any si