David Alves wrote:
Hi guys
We've had HBase (0.18.0, r695089) and Hadoop (0.18.0, r686010)
running for a while, and apart from the occasional regionserver
stopping without notice (and without explanation, from what we can
see in the logs), a problem we solve easily just by restarting it,
we have now come to face a more serious problem of what I think is
data loss.
What do you think it is, David? A hang? We've seen occasional hangups
on HDFS. You could try thread-dumping to see where things are blocked
(you can do it in the UI on the problematic regionserver, or by
sending QUIT to the JVM PID).
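As a sketch, sending QUIT from the shell looks like the following; the pid-file location is an assumption (the stock scripts write it under /tmp, but check your HBASE_PID_DIR and user name):

```shell
# Assumption: default pid-file location used by the hbase start scripts.
PIDFILE=/tmp/hbase-$(whoami)-regionserver.pid

if [ -f "$PIDFILE" ]; then
  # SIGQUIT makes the JVM print a full thread dump to its stdout log
  # (usually logs/hbase-<user>-regionserver-<host>.out) without killing it.
  kill -QUIT "$(cat "$PIDFILE")" && echo "thread dump requested"
else
  echo "no regionserver pid file at $PIDFILE"
fi
```

Look in the resulting dump for threads stuck in DFSClient calls.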
We use HBase as a links and documents database (similar to Nutch)
on a 3-node cluster (4GB of memory on each node); the links database has 4
regions and the document database now has 200 regions, for a total of
216 (with meta and root).
How much RAM is allocated to HBase? Does each database have a single family or more?
After the crawl task, which went OK (we now have 60GB of 300GB used
in HDFS), we proceeded to do a full table scan to create the indexes, and
that's where things started to fail.
We are seeing a problem in the logs (at the end of this email).
This repeats until there's a RetriesExhaustedException and the task
fails in the map phase. The Hadoop fsck tool tells us that HDFS is OK. I
still have to explore the rest of the logs searching for some kind of error;
I will post a new mail if I find anything.
Any help would be greatly appreciated.
Is this file in your HDFS:
hdfs://cyclops-prod-1:9000/hbase/document/153945136/docDatum/mapfiles/5163556575658593611/data?
If so, can you fetch it using ./bin/hadoop fs -get FILENAME?
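Something like the following (the local destination path is arbitrary):

```shell
# The mapfile data path quoted from the error in this thread.
HDFS_FILE="hdfs://cyclops-prod-1:9000/hbase/document/153945136/docDatum/mapfiles/5163556575658593611/data"

# Pull a local copy; a failure here usually points at missing or corrupt
# blocks for that file rather than an HBase-level problem.
./bin/hadoop fs -get "$HDFS_FILE" /tmp/mapfile-data || echo "fetch failed"
```

If the fetch fails, running fsck against that path (rather than the whole filesystem) should show which blocks are affected.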
What crawler are you using (out of interest)?
St.Ack