Re: should we split the scan range into several segments when the scan range is located in only a single region?

Chia-Ping Tsai Mon, 04 Sep 2017 02:13:55 -0700

That sounds good. There are some related issues; see
https://issues.apache.org/jira/browse/HBASE-4914 and
https://issues.apache.org/jira/browse/HBASE-4063.


On 2017-09-04 15:06, libis <libistha...@gmail.com> wrote: 
> Hi
> 
> When TableInputFormat is used to source an HBase table in a MapReduce job,
> its splitter creates one map task for each region of the table. However, in
> some cases the user's scan range may fall within a single region, so only
> one mapper is created. For example, if the rowkey of the table is
> "md5(userid) + timestamp" and a client wants to scan the data of a
> specified user for the latest month with MR, it is very likely that only
> one mapper will be working.
> 
> In order to scan data in parallel when the user's scan range is located in
> a single region, should we split the scan range into several segments
> within a region?
> 
> Best,
> 
> xinxin
>
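
To make the proposal concrete, here is a minimal sketch of how a [start, stop) rowkey range could be cut into n roughly equal segments, each of which could then back its own input split. It is plain Java with no HBase dependency; the class name ScanRangeSplitter and the method splitRange are hypothetical, not part of TableInputFormat. It assumes rowkeys are compared as unsigned bytes and that data is spread fairly evenly across the range, which may not hold for real tables.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not an HBase API: computes boundary keys that cut
// the row range [start, stop) into n contiguous segments.
public class ScanRangeSplitter {

    // Returns n + 1 boundary keys; segment i covers [keys[i], keys[i + 1]).
    // If the range is narrower than n distinct keys, some boundaries repeat
    // and the corresponding segments are empty.
    static List<byte[]> splitRange(byte[] start, byte[] stop, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        // Right-pad both keys with zero bytes to a common length so they
        // can be treated as unsigned integers of the same width.
        int len = Math.max(start.length, stop.length);
        BigInteger lo = new BigInteger(1, Arrays.copyOf(start, len));
        BigInteger hi = new BigInteger(1, Arrays.copyOf(stop, len));
        if (lo.compareTo(hi) >= 0) {
            throw new IllegalArgumentException("start must sort before stop");
        }
        BigInteger width = hi.subtract(lo);

        List<byte[]> keys = new ArrayList<>();
        keys.add(start);
        for (int i = 1; i < n; i++) {
            // Boundary i sits at lo + width * i / n.
            BigInteger boundary = lo.add(
                width.multiply(BigInteger.valueOf(i)).divide(BigInteger.valueOf(n)));
            keys.add(toFixedLength(boundary, len));
        }
        keys.add(stop);
        return keys;
    }

    // Converts a non-negative value back to exactly len bytes, big-endian.
    static byte[] toFixedLength(BigInteger value, int len) {
        byte[] raw = value.toByteArray(); // may carry a leading sign byte
        byte[] out = new byte[len];
        int copy = Math.min(raw.length, len);
        System.arraycopy(raw, raw.length - copy, out, len - copy, copy);
        return out;
    }

    public static void main(String[] args) {
        // Illustrative keys only; real rowkeys would be md5(userid) + timestamp.
        List<byte[]> keys = splitRange(new byte[] {0x10}, new byte[] {0x50}, 4);
        for (int i = 0; i + 1 < keys.size(); i++) {
            System.out.println(Arrays.toString(keys.get(i))
                + " .. " + Arrays.toString(keys.get(i + 1)));
        }
    }
}
```

Each adjacent pair of boundaries would become the start and stop row of one sub-scan. Even division of the key space only balances the mappers when rows are evenly distributed, so a real splitter would likely need a sampling or size-based strategy instead.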
