This "multiple mappers for one region" request could be useful for CPU-intensive map jobs.
Together with the earlier feedback from Cosmin Lehene, maybe we can just allow the application to configure the number of mappers via TableInputFormat.

https://issues.apache.org/jira/browse/HBASE-4063

-----Original Message-----
From: saint....@gmail.com [mailto:saint....@gmail.com] On Behalf Of Stack
Sent: Thursday, June 30, 2011 10:58 PM
To: dev@hbase.apache.org
Subject: Re: TableInputFormat improvement to handle lots of small regions

On Thu, Jun 30, 2011 at 8:38 AM, Ophir Cohen <oph...@gmail.com> wrote:
> Actually I thought of opposite version:
> If I have a spare map slots why not configure it to run more than one mapper
> on region?
> The question then is how to 'skip' the mappers to the needed places inside
> the regions.

Well, the current splitter passed mappers Scans where the start/end rows are the region boundaries (at the time at which the splitter ran). To do your case, in the splitter, you'd just give out multiple splits per region. To cut up the region key-space, you might use the Bytes.split code. It does coarse BigNumber math dividing the key space. See here:
http://hbase.apache.org/xref/org/apache/hadoop/hbase/util/Bytes.html#1034

St.Ack
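
To make the "multiple splits per region" idea concrete, here is a minimal standalone sketch of dividing one region's key range with big-integer math. It is not the HBase Bytes.split implementation and has no HBase dependency. The class RegionSubSplitSketch and its methods are hypothetical names made up for illustration. It assumes both region boundaries are non-empty, so the first and last regions of a table (which have an empty start or end key) would need separate handling.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of HBase: divides one region's key range
// into roughly equal sub-ranges using unsigned big-integer arithmetic.
class RegionSubSplitSketch {

    // Returns pieces + 1 boundaries: start, the interior split points, end.
    // Assumes both keys are non-empty and start sorts before end after
    // zero-padding. Very narrow ranges may yield duplicate interior points.
    static List<byte[]> splitPoints(byte[] start, byte[] end, int pieces) {
        if (pieces < 1 || start.length == 0 || end.length == 0) {
            throw new IllegalArgumentException("need pieces >= 1 and non-empty keys");
        }
        // Right-pad with zero bytes so both keys have the same width.
        int len = Math.max(start.length, end.length);
        BigInteger lo = new BigInteger(1, Arrays.copyOf(start, len));
        BigInteger hi = new BigInteger(1, Arrays.copyOf(end, len));
        if (lo.compareTo(hi) >= 0) {
            throw new IllegalArgumentException("start must sort before end");
        }
        BigInteger range = hi.subtract(lo);
        List<byte[]> bounds = new ArrayList<>();
        bounds.add(start);
        for (int i = 1; i < pieces; i++) {
            // Interior point i sits at lo + range * i / pieces (integer division).
            BigInteger offset = range.multiply(BigInteger.valueOf(i))
                                     .divide(BigInteger.valueOf(pieces));
            bounds.add(toFixedLength(lo.add(offset), len));
        }
        bounds.add(end);
        return bounds;
    }

    // Converts a non-negative value back to exactly len big-endian bytes,
    // dropping BigInteger's sign byte or left-padding with zeros as needed.
    static byte[] toFixedLength(BigInteger value, int len) {
        byte[] raw = value.toByteArray();
        byte[] out = new byte[len];
        int copy = Math.min(raw.length, len);
        System.arraycopy(raw, raw.length - copy, out, len - copy, copy);
        return out;
    }

    public static void main(String[] args) {
        // Cut the one-byte range [0x00, 0x10) into four sub-ranges.
        for (byte[] b : splitPoints(new byte[] {0x00}, new byte[] {0x10}, 4)) {
            System.out.println(new BigInteger(1, b).toString(16));
        }
    }
}
```

In a custom splitter, each consecutive pair of boundaries would become the start and stop row of one split, so a region cut into four pieces would feed four mappers. The division is coarse: it spreads the key space evenly, not the rows, so skewed keys can still give uneven mappers.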