1. Yes. I configure my job using this line:
   TableMapReduceUtil.initTableMapperJob(HBaseConsts.URLS_TABLE_NAME, scan, ScanMapper.class, Text.class, MapWritable.class, job)
   which internally uses TableInputFormat.class.
2. One split per region? What do you mean? How do I do that?
3. HBase version 0.90.2.
4. No exceptions. The logs are very clean.

On Mon, Jul 4, 2011 at 5:22 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> Do you use TableInputFormat?
> To scan a large number of rows, it would be better to produce one split per
> region.
>
> What HBase version do you use?
> Do you find any exception in master / region server logs around the moment
> of timeout?
>
> Cheers
>
> On Mon, Jul 4, 2011 at 4:48 AM, Lior Schachter <li...@infolinks.com> wrote:
>
> > Hi all,
> > I'm running a scan using the M/R framework.
> > My table contains hundreds of millions of rows, and I'm scanning about
> > 50 million of them using start/stop keys.
> >
> > The problem is that some map tasks get stuck and the task manager kills
> > these maps after 600 seconds. When retrying the task everything works fine
> > (sometimes).
> >
> > To verify that the problem is in HBase (and not in the map code) I removed
> > all the code from my map function, so it looks like this:
> >
> > public void map(ImmutableBytesWritable key, Result value, Context context)
> >     throws IOException, InterruptedException {
> > }
> >
> > Also, when the map got stuck on a region, I tried to scan this region
> > (using a simple scan from a Java main) and it worked fine.
> >
> > Any ideas?
> >
> > Thanks,
> > Lior