Hi all,

I know this question has probably been asked before, but I'm having 
difficulty figuring out a couple of aspects of a custom LoadFunc that reads 
from a DB. And yes, I did try to Google my way to an answer. Anyhoo, for what 
it's worth, I have a MySQL table that I wish to load via Pig. I have the 
LoadFunc working using PigServer in a Java app, but I noticed the following 
when my job gets submitted to my MR cluster. My custom InputFormat generates 
6 InputSplits, where each split specifies a non-overlapping range/page of 
records to read. I thought that each InputSplit would correspond to its own 
map task, but what I see in the JobTracker is that the submitted job has only 
1 map task, which executes the splits serially. Is my understanding correct 
that each split should be assigned to its own map task? If so, how can I 
coerce the submitted MR job to run each of my splits in its own map task?
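In case it helps, the non-overlapping range/page arithmetic I described is roughly like the sketch below (plain Java with no Hadoop dependencies; the class and method names here are made up for illustration and are not my actual InputFormat). In the real code, each [start, end) pair backs one InputSplit:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: divide totalRows table rows into numSplits
// non-overlapping [start, end) pages, one page per intended InputSplit.
public class SplitRanges {
    public static List<long[]> computeRanges(long totalRows, int numSplits) {
        List<long[]> ranges = new ArrayList<>();
        long base = totalRows / numSplits;      // rows per split
        long remainder = totalRows % numSplits; // extra rows go to the first splits
        long start = 0;
        for (int i = 0; i < numSplits; i++) {
            long size = base + (i < remainder ? 1 : 0);
            ranges.add(new long[] { start, start + size }); // [start, end)
            start += size;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // e.g. 100 rows across 6 splits -> 6 contiguous, non-overlapping pages
        for (long[] r : computeRanges(100, 6)) {
            System.out.println(r[0] + ".." + r[1]);
        }
    }
}
```

So the ranges themselves are disjoint and cover the table; my question is only about why they all end up in one map task.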

Thanks,
-Terry
