Hello,
I have been able to make this work. I don't know why, but when the
input file is zipped (read as an input stream) it creates only 1 mapper.
However, when it's not zipped, it creates more mappers (running 3 instances
it created 4 mappers, and running 5 instances it created 8 mappers).
I tried looking at your code; it's a bit involved. Instead of trying to run
the job, try unit-testing your input format. Test getSplits(): whatever
number of splits that method returns is the number of mappers
that will run.
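As a plain-Java illustration of why the split count drives the mapper count (the file sizes and split size below are made-up numbers; gzip's non-splittability is the usual reason a zipped file yields a single split and therefore a single mapper):

```java
public class SplitCountSketch {
    // FileInputFormat-style split counting: an uncompressed file is cut
    // into roughly ceil(fileSize / splitSize) pieces, but a gzipped file
    // is not splittable, so it always becomes exactly one split.
    static long countSplits(long fileSize, long splitSize, boolean splittable) {
        if (!splittable) {
            return 1; // e.g. a .gz input stream: one split, one mapper
        }
        return (fileSize + splitSize - 1) / splitSize; // ceiling division
    }

    public static void main(String[] args) {
        long splitSize = 64L * 1024 * 1024; // hypothetical 64 MB split size
        // A 200 MB plain-text file -> 4 splits -> 4 mappers.
        System.out.println(countSplits(200L * 1024 * 1024, splitSize, true));
        // The same data gzipped -> 1 split -> 1 mapper.
        System.out.println(countSplits(200L * 1024 * 1024, splitSize, false));
    }
}
```

This is why a unit test against getSplits() is enough: assert on the size of the returned list and you know the mapper count without running the job.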
You can also use LocalJobRunner for this - set
I'm unfamiliar with EMR myself (perhaps the question fits EMR's own
boards) but here's my take anyway:
On Mon, Jan 28, 2013 at 9:24 PM, Marcelo Elias Del Valle
mvall...@gmail.com wrote:
Hello,
I am using Hadoop with TextInputFormat, a mapper, and no reducers. I am
running my jobs at Amazon
Sorry for asking so many questions, but the answers are really helping.
2013/1/28 Harsh J ha...@cloudera.com
This seems CPU-oriented. You probably want the NLineInputFormat? See
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/lib/NLineInputFormat.html
.
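A back-of-the-envelope sketch in plain Java of what NLineInputFormat does to the mapper count (one split per N input lines, so the mapper count scales with line count rather than byte count; the numbers are illustrative):

```java
public class NLineSketch {
    // NLineInputFormat makes one split per N input lines, so the number
    // of mappers is ceil(totalLines / linesPerMap) rather than a function
    // of the file size in bytes.
    static long mappersFor(long totalLines, long linesPerMap) {
        return (totalLines + linesPerMap - 1) / linesPerMap; // ceiling division
    }

    public static void main(String[] args) {
        // 1000 input lines with 100 lines per map -> 10 mappers.
        System.out.println(mappersFor(1000, 100));
        // 1001 lines -> 11 mappers: the final, partial split still runs.
        System.out.println(mappersFor(1001, 100));
    }
}
```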
Just to complement the last question, I have implemented the getSplits
method in my input format:
https://github.com/mvallebr/CSVInputFormat/blob/master/src/main/java/org/apache/hadoop/mapreduce/lib/input/CSVNLineInputFormat.java
However, it still doesn't create more than 2 map tasks. Is there
Regarding your original question, you can use the min and max split settings to
control the number of maps:
http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.html
. See #setMinInputSplitSize and #setMaxInputSplitSize. Or use
mapred.min.split.size
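For reference, the clamping FileInputFormat applies when sizing splits is, to the best of my reading of the stable API, splitSize = max(minSize, min(maxSize, blockSize)). A plain-Java sketch of that formula (block and split sizes are hypothetical):

```java
public class SplitSizeSketch {
    // FileInputFormat computes the effective split size by clamping the
    // block size between the configured min and max split sizes:
    //   splitSize = max(minSize, min(maxSize, blockSize))
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024; // hypothetical 64 MB HDFS block
        // Raising the min split size above the block size -> fewer, larger
        // splits, hence fewer mappers.
        System.out.println(computeSplitSize(blockSize, 128L * 1024 * 1024, Long.MAX_VALUE));
        // Lowering the max split size below the block size -> more, smaller
        // splits, hence more mappers.
        System.out.println(computeSplitSize(blockSize, 1, 32L * 1024 * 1024));
    }
}
```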
Hi,
(Answers may be 0.20 specific)
On Sat, Dec 4, 2010 at 6:41 AM, Jason urg...@gmail.com wrote:
In my mapper code I need to know the total number of mappers, which is the
same as the number of input splits.
(I need it for unique int ID generation)
mapred.map.tasks is set for every job before
Minor correction, it is:
mapred.tip.id is the task's id (contains various info about the task,
map/reduce).
mapred.task.id is the task's _attempt_ id (basically tip id, with
attempt information, map/reduce).
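To make the distinction concrete, here is a small plain-Java sketch using a made-up attempt id. The string shapes (task_..., attempt_...) follow the usual Hadoop naming, but treat the exact example id as hypothetical:

```java
public class TaskIdSketch {
    // A task attempt id looks like: attempt_<jobts>_<jobseq>_m_<task>_<attempt#>
    // Its task (tip) id is the same string with the "attempt_" prefix swapped
    // for "task_" and the trailing attempt number removed.
    static String tipIdOf(String attemptId) {
        String noPrefix = attemptId.substring("attempt_".length());
        String noAttempt = noPrefix.substring(0, noPrefix.lastIndexOf('_'));
        return "task_" + noAttempt;
    }

    public static void main(String[] args) {
        // Hypothetical ids for illustration.
        String attempt = "attempt_201012040641_0001_m_000003_0";
        System.out.println(tipIdOf(attempt)); // task_201012040641_0001_m_000003
    }
}
```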
On Sat, Dec 4, 2010 at 7:29 AM, Harsh J qwertyman...@gmail.com wrote:
BTW, why not take the task attempt id (context.getTaskAttemptID()) as the
prefix of the unique id? The task attempt id for each task should be
different.
The reason is that I would prefer not to have big gaps in my int id sequence,
so I'd rather store the mapper task ID in the low bits (suffix instead of
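The "task id in the low bits" scheme above can be sketched in plain Java. The bit width reserved for the task index is an assumption; pick it large enough for your maximum mapper count:

```java
public class UniqueIdSketch {
    static final int TASK_BITS = 16; // assumption: at most 2^16 mapper tasks

    // Pack a per-mapper counter into the high bits and the mapper's task
    // index into the low bits ("suffix"), so each mapper's ids stay dense
    // instead of leaving huge gaps between mappers.
    static long uniqueId(long localCounter, int taskIndex) {
        return (localCounter << TASK_BITS) | taskIndex;
    }

    public static void main(String[] args) {
        // Two mappers (task indexes 0 and 1), each emitting counters 0, 1, ...
        // never collide, and ids from the same mapper grow densely.
        System.out.println(uniqueId(0, 0)); // 0
        System.out.println(uniqueId(0, 1)); // 1
        System.out.println(uniqueId(1, 0)); // 65536
    }
}
```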
machines, where I have 8 cores for 3 of
them, but 2 cores for 2 of them, and the 8-core machines are more
powerful (faster, more memory, more disk).
Currently, I am using only the 3 machines (each with 8 cores), and the
max number of mapper tasks is 8.
I may use one of the 2-core machines as the master, but it turns out I
need a powerful master.
Is there any way to specify that some machines run, say, 8 mapper
tasks, while some machines run only 2 tasks?
What I can imagine is to extend the slaves file, and have
machine1:8
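One common approach in classic MapReduce, rather than extending the slaves file, is to set the per-TaskTracker slot limit in each node's own mapred-site.xml. A sketch, assuming Hadoop 1.x property names:

```xml
<!-- mapred-site.xml on each 8-core slave -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>8</value>
</property>

<!-- mapred-site.xml on each 2-core slave -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>2</value>
</property>
```

Each TaskTracker reads its own copy of this file at startup, so heterogeneous machines can advertise different numbers of map slots.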