Yes, the number of regions is only 1 for that table. I put a log statement inside map() and printed InputSplit.getLength() from the map context.
If it's only 1 region, then the data might be too much for just 1 map to process. Can I set a lower size for a region?

Thank you.

Regards,
Raghava.

On Thu, Jun 17, 2010 at 1:41 AM, Stack <[email protected]> wrote:

> Map tasks if you are using TableInputFormat will be equal to the
> number of regions in your table.
>
> Region is the natural body of work for a Map task using hbase as a MR
> job source. If little data in your table, splitting this way makes
> little sense (You have one region only in your table, is that right?).
> You could force splits of your region to make more via the UI or
> shell?
>
> Otherwise, you need to make your own Splitter, one that has some
> knowledge of the key space and is able to partition on other than
> Region boundaries.
>
> See below...
>
> On Wed, Jun 16, 2010 at 10:36 PM, Raghava Mutharaju
> <[email protected]> wrote:
> > Hi all,
> >
> > I checked the size of the InputSplit in Map and it gave out 0. I was
> > expecting some number indicating the size of split in bytes, that this Map
> > has received. Is this normal behavior?
> >
>
> Where are you seeing this (so I can be sure I'm following along properly).
>
> St.Ack
>
> > Another issue I am having is even though I set the mapred.map.tasks to a
> > specific number (no of nodes*10), during execution, the no of map tasks is
> > always 1. I think this is related to the above issue.
> >
> > I am using HBase as the data source and sink. Previously, when I used HDFS
> > as data source, the no of map tasks were same as the one I used to set. I am
> > using HBase 0.20.4
> >
> > Thank you.
> >
> > Regards,
> > Raghava.
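
For reference, the "own Splitter" idea from Stack's reply could be sketched roughly as below. This is only an illustration of the key-range arithmetic, not HBase code: it assumes row keys are non-negative numbers stored as fixed-width, zero-padded decimal strings, and the class name KeyRangeSplitter and its split() and pad() methods are made up for this example. In a real job, each [start, stop) pair would presumably become one input split (for instance from an overridden getSplits() in a TableInputFormatBase subclass), which is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: cuts a numeric key range into contiguous sub-ranges
// so that one region can be handed to several map tasks.
class KeyRangeSplitter {

    // Returns up to `parts` pairs {startKey, stopKey}; start is inclusive,
    // stop is exclusive. Assumes keys are zero-padded decimals of `width` digits.
    static List<String[]> split(long start, long end, int parts, int width) {
        if (start < 0 || end <= start || parts < 1) {
            throw new IllegalArgumentException("need 0 <= start < end and parts >= 1");
        }
        if (Long.toString(end).length() > width) {
            // Wider keys would break lexicographic ordering of the padded strings.
            throw new IllegalArgumentException("end does not fit in " + width + " digits");
        }
        long total = end - start;
        int n = (int) Math.min(parts, total);   // never create empty ranges
        long base = total / n;
        long extra = total % n;                  // first `extra` ranges get one more key
        List<String[]> ranges = new ArrayList<>();
        long lo = start;
        for (int i = 0; i < n; i++) {
            long hi = lo + base + (i < extra ? 1 : 0);
            ranges.add(new String[] { pad(lo, width), pad(hi, width) });
            lo = hi;
        }
        return ranges;
    }

    // Zero-pads a value to the assumed fixed key width.
    static String pad(long value, int width) {
        return String.format("%0" + width + "d", value);
    }

    public static void main(String[] args) {
        // Example: keys 0..99 split four ways.
        for (String[] r : split(0, 100, 4, 5)) {
            System.out.println(r[0] + " .. " + r[1]);
        }
    }
}
```

The ranges are contiguous and non-overlapping, so scanning each one separately covers every key exactly once. Whether the keys in a given table actually follow a layout like this has to be checked against the table itself.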
