OK, I am watching the JIRA issue now.

2017-09-05 15:22 GMT+08:00 Chia-Ping Tsai <chia7...@apache.org>:
> Yeah, 16894 is also a similar one. Maybe Yi Liang is still working on this. Let's move this discussion to the JIRA.
>
> On 2017-09-05 09:53, libis <libistha...@gmail.com> wrote:
> > Thanks, Mikhail. I am pleased to pick up HBASE-18090 (my JIRA account is xinxin fan). I notice that HBASE-16894 (https://issues.apache.org/jira/browse/HBASE-16894) tries to work on a similar thing. Chia-Ping, could you take a look at it?
> >
> > 2017-09-04 20:41 GMT+08:00 Chia-Ping Tsai <chia7...@apache.org>:
> >
> > > Thanks for the information, Mikhail. It seems to me the issue is popular. libis, could you take over HBASE-18090? I can assign the issue to you if I get your JIRA account.
> > >
> > > On 2017-09-04 20:26, Mikhail Antonov <olorinb...@gmail.com> wrote:
> > > > I filed https://issues.apache.org/jira/browse/HBASE-18090 some time ago and attached a draft patch to it. It's not complete, as we need some deeper changes in the way we open regions (see comments), but the basic stuff works (I ended up going the other route and didn't have the bandwidth to finish that; it would be great if someone picked it up).
> > > >
> > > > Mikhail
> > > >
> > > > On Mon, Sep 4, 2017 at 11:13 AM Chia-Ping Tsai <chia7...@apache.org> wrote:
> > > >
> > > > > That sounds good. There are some related issues. See https://issues.apache.org/jira/browse/HBASE-4914 and https://issues.apache.org/jira/browse/HBASE-4063.
> > > > >
> > > > > On 2017-09-04 15:06, libis <libistha...@gmail.com> wrote:
> > > > > > Hi
> > > > > >
> > > > > > When TableInputFormat is used to source an HBase table in a MapReduce job, its splitter will make a map task for each region of the table. However, in some cases, the user’s scan range may fall within a single region, resulting in only one mapper. For example, if the rowkey of the table is ‘md5(userid) + timestamp’ and a client wants to scan the data of a specified user for the latest month with MR, it is very likely that only one mapper will be working.
> > > > > >
> > > > > > In order to scan data in parallel when the user's scan range is located in a single region, should we split the scan range into several segments within a region?
> > > > > >
> > > > > > Best,
> > > > > >
> > > > > > xinxin
> > > > > >
> > > >
> > > > --
> > > > Thanks,
> > > > Michael Antonov
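
The proposal in the original message, splitting one scan range into several segments inside a region, can be illustrated with a small sketch. This is not HBase code and not the approach taken in the HBASE-18090 draft patch; `split_scan_range` and its `pad` parameter are hypothetical names. It assumes both row keys are non-empty byte strings, that rows sort in lexicographic byte order, and that keys are spread roughly evenly between the two boundaries.

```python
from typing import List, Tuple


def split_scan_range(start: bytes, stop: bytes, n: int,
                     pad: int = 2) -> List[Tuple[bytes, bytes]]:
    """Split the half-open key range [start, stop) into at most n segments.

    Hypothetical helper: keys are right-padded with zero bytes to a common
    width and treated as big-endian integers, so that evenly spaced integers
    map back to evenly spaced byte-string boundaries.
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    if start >= stop:
        raise ValueError("start must sort before stop")

    # Extra pad bytes give room for cut points between adjacent short keys.
    width = max(len(start), len(stop)) + pad
    lo = int.from_bytes(start.ljust(width, b"\x00"), "big")
    hi = int.from_bytes(stop.ljust(width, b"\x00"), "big")

    cuts = [start]
    prev = lo
    for i in range(1, n):
        point = lo + (hi - lo) * i // n
        # Skip duplicate cut points when the range is too narrow for n parts.
        if prev < point < hi:
            cuts.append(point.to_bytes(width, "big"))
            prev = point
    cuts.append(stop)

    # Pair consecutive boundaries into (segment_start, segment_stop).
    return list(zip(cuts, cuts[1:]))


# Example: four segments, each of which could back its own input split.
for seg_start, seg_stop in split_scan_range(b"\x00", b"\x10", 4):
    print(seg_start.hex(), "->", seg_stop.hex())
```

Each returned pair could serve as the start and stop row of a separate scan, so that several mappers cover one region. With the ‘md5(userid) + timestamp’ key from the example, all keys of one user share the md5 prefix, so the cut points fall in the timestamp part; whether the resulting segments hold similar amounts of data depends on how the rows are actually distributed over time.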