Set number of mappers by the number of input lines for a single file?

biro lehel Sun, 20 May 2012 02:19:16 -0700

Dear all,

I have a single input file in which every line contains hydrological 
calibration model data. Each line of the file should be processed, and the 
output for every line should be written to a single output file.


As I understand it, Hadoop spawns as many mapper tasks as there are input 
files (meaning that, in my case, a single mapper would be created). However, 
I want each mapper to deal with only a single line of my input file (number 
of mapper tasks = number of lines in my file).

What is the best way to obtain such behavior? How should I specify this to 
Hadoop?

Any suggestions are more than welcome.

Thank you,
Lehel.
