That sounds good. There are some related issues; see https://issues.apache.org/jira/browse/HBASE-4914 and https://issues.apache.org/jira/browse/HBASE-4063.
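
To make the idea concrete, here is a rough sketch of how a scan range could be cut into several segments. It is only an illustration, not what TableInputFormat does today: the class name ScanRangeSplitter and the method splitRange are made up for this mail, and it assumes that row keys can be treated as unsigned big-endian integers (shorter keys are right-padded with zero bytes) and that the rows are spread fairly evenly over the range.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of HBase: splits [start, stop) into n segments.
class ScanRangeSplitter {

    /**
     * Returns n + 1 boundary keys. Segment i covers [bounds[i], bounds[i + 1]).
     * Assumption: keys are interpreted as unsigned big-endian integers of equal
     * width. If the range holds fewer than n distinct values, some adjacent
     * boundaries are equal and those segments are empty.
     */
    static List<byte[]> splitRange(byte[] start, byte[] stop, int n) {
        int len = Math.max(start.length, stop.length);
        // Right-pad with zero bytes so both keys have the same width.
        BigInteger lo = new BigInteger(1, Arrays.copyOf(start, len));
        BigInteger hi = new BigInteger(1, Arrays.copyOf(stop, len));
        if (n < 1 || lo.compareTo(hi) >= 0) {
            throw new IllegalArgumentException("need n >= 1 and start < stop");
        }
        BigInteger span = hi.subtract(lo);
        List<byte[]> bounds = new ArrayList<>();
        bounds.add(start);
        for (int i = 1; i < n; i++) {
            // Boundary i sits at lo + span * i / n (integer division).
            BigInteger offset = span.multiply(BigInteger.valueOf(i))
                                    .divide(BigInteger.valueOf(n));
            bounds.add(toFixedWidth(lo.add(offset), len));
        }
        bounds.add(stop);
        return bounds;
    }

    /** Converts a non-negative value to exactly len bytes, left-padded with zeros. */
    static byte[] toFixedWidth(BigInteger value, int len) {
        byte[] raw = value.toByteArray(); // may carry an extra leading sign byte
        byte[] out = new byte[len];
        int copy = Math.min(raw.length, len);
        System.arraycopy(raw, raw.length - copy, out, len - copy, copy);
        return out;
    }
}
```

Each adjacent pair of boundaries could then become the start and stop row of its own input split, so that several mappers scan the same region in parallel. Splitting evenly by key value only balances the work if the data is evenly distributed within the range, which may not hold for a timestamp suffix.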
On 2017-09-04 15:06, libis <libistha...@gmail.com> wrote:
> Hi
>
> When TableInputFormat is used to source an HBase table in a MapReduce job,
> its splitter will make a map task for each region of the table. However, in
> some cases, the user's scan range may fall within a single region, resulting
> in only one mapper. For example, if the rowkey of the table is
> "md5(userid) + timestamp" and a client wants to scan the data of a
> specified user in the latest month with MR, it is very likely that only one
> mapper will be working.
>
> In order to scan data in parallel when the user's scan range is located in a
> single region, should we split the scan range into several segments within
> a region?
>
> Best,
>
> xinxin