Re: Reading in parallel from table's regions in MapReduce

Doug Meil Tue, 04 Sep 2012 08:32:53 -0700

Hi there-

Yes, there is one input split for each region of the source table of an MR
job, so each region gets its own map task.


There is a blurb on that in the RefGuide...

http://hbase.apache.org/book.html#splitter
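
To make the one-split-per-region idea concrete, here is a rough, language-neutral model of it in Python. It is only a sketch and not HBase's code: the function name splits_for_scan, the sample region keys, and the clipping of splits to the scan range are assumptions made for illustration. The one convention borrowed from HBase is that an empty key stands for "unbounded", as with the first and last region of a table.

```python
from typing import List, Tuple

# A region is modeled as (start_key, end_key); b"" means "unbounded".
Region = Tuple[bytes, bytes]


def splits_for_scan(regions: List[Region],
                    scan_start: bytes = b"",
                    scan_stop: bytes = b"") -> List[Region]:
    """Return one (start, stop) split per region that overlaps the scan range.

    Hypothetical helper: a simplified model, not the HBase implementation.
    """
    splits = []
    for region_start, region_end in regions:
        # Skip regions that end at or before the start of the scan.
        if scan_start and region_end and region_end <= scan_start:
            continue
        # Skip regions that begin at or after the end of the scan.
        if scan_stop and region_start >= scan_stop:
            continue
        # Clip the split to the part of the region the scan covers.
        split_start = max(region_start, scan_start)
        if not region_end:
            split_stop = scan_stop
        elif not scan_stop:
            split_stop = region_end
        else:
            split_stop = min(region_end, scan_stop)
        splits.append((split_start, split_stop))
    return splits


# Hypothetical table with three regions: [-inf, g), [g, p), [p, +inf)
regions = [(b"", b"g"), (b"g", b"p"), (b"p", b"")]

# A full-table scan yields one split per region.
full_scan = splits_for_scan(regions)

# A scan restricted to [h, k) only touches the middle region.
narrow_scan = splits_for_scan(regions, b"h", b"k")
```

In this model a full scan over three regions produces three splits, and therefore three map tasks. Whether those tasks actually run at the same time depends on the cluster, which the model does not cover.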





On 9/4/12 11:17 AM, "Ioakim Perros" <imper...@gmail.com> wrote:

>Hello,
>
>I would be grateful if someone could shed some light on the following:
>
>Each M/R map task reads data from a separate region of a table.
>From the JobTracker's GUI, in the map completion graph, I notice that
>although the mappers read different data, they read it sequentially,
>as if the table had a lock that permits only one mapper at a time to
>read from any region.
>
>Does this "lock" hypothesis make sense? Is there any way I could avoid
>this useless delay?
>
>Thanks in advance and regards,
>Ioakim
>
