This is similar to a mail another user sent to the group a couple of months
back. I am quite new to HBase and have been trying to run a basic experiment
with it.
1. I am trying to load 200 million records of around 15 KB each: one column
value is around 14 KB and the rest of the 100 column values are 8 bytes each.
The 120 columns are grouped as 10 qualifiers x 12 families (I hope I got my
jargon right). Note that only one value per record is large compared to the
other values.
2. The data is uncompressed, and each value is selected uniformly at random.
3. I used a map-reduce job to load a data file on HDFS into the database.
Soon after the job finished, the region servers crashed with an
OutOfMemoryError. Part of the log from one of the region servers is at the
end of this mail.
I have attached the conf to this email. Can you point out any anomaly in my
settings? I have set a heap size of 3 GB; with anything significantly larger,
the 32-bit JVM does not start.
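
To put the numbers from point 1 in perspective, here is a rough
back-of-the-envelope sketch in Python. It assumes 1 KB = 1024 bytes, takes
the 100 small values at 8 bytes each as stated above, and ignores the per-cell
overhead HBase adds (row key, family, qualifier and timestamp stored with
every value), so the real on-disk size would be larger. The names are mine,
for illustration only.

```python
# Rough sizing of the data set described in point 1.
# Assumptions (mine): 1 KB = 1024 bytes; per-cell key overhead is ignored.

RECORDS = 200_000_000          # records loaded
LARGE_VALUE_BYTES = 14 * 1024  # the one large value per record (~14 KB)
SMALL_VALUES = 100             # remaining small values per record
SMALL_VALUE_BYTES = 8          # size of each small value


def record_bytes() -> int:
    """Raw value bytes in one record, without any key overhead."""
    return LARGE_VALUE_BYTES + SMALL_VALUES * SMALL_VALUE_BYTES


def total_tib() -> float:
    """Raw value bytes of the whole data set, in TiB (2**40 bytes)."""
    return RECORDS * record_bytes() / 1024**4


if __name__ == "__main__":
    print(f"bytes per record: {record_bytes()}")
    print(f"raw data set:     {total_tib():.2f} TiB")
```

So the values alone come to roughly 15 KB per record, and the whole load is
in the low terabytes before any HDFS replication.
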
2010-05-12 19:22:45,068 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=8.43782MB (8847696), Free=1791.2247MB (1878235312), Max=1799.6626MB (1887083008), Counts: Blocks=1, Access=16947, Hit=52, Miss=16895, Evictions=0, Evicted=0, Ratios: Hit Ratio=0.3068389603868127%, Miss Ratio=99.69316124916077%, Evicted/Run=NaN
2010-05-12 19:22:45,069 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/1651418343/col5/7617863559659933969, isReference=false, sequence id=2470632548, length=8456716, majorCompaction=false
2010-05-12 19:22:45,075 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/1651418343/col6/1328113038200437659, isReference=false, sequence id=2960732840, length=19861, majorCompaction=false
2010-05-12 19:22:45,078 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/1651418343/col6/6484804359703635950, isReference=false, sequence id=2470632548, length=8456716, majorCompaction=false
2010-05-12 19:22:45,082 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/1651418343/col7/1673569837212457160, isReference=false, sequence id=2960732840, length=19861, majorCompaction=false
2010-05-12 19:22:45,085 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/1651418343/col7/4737399093829085995, isReference=false, sequence id=2470632548, length=8456716, majorCompaction=false
2010-05-12 19:22:47,238 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/1651418343/col8/8446828932792437464, isReference=false, sequence id=2960732840, length=19861, majorCompaction=false
2010-05-12 19:22:47,241 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/1651418343/col8/974386128174268353, isReference=false, sequence id=2470632548, length=8456716, majorCompaction=false
2010-05-12 19:22:48,804 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/1651418343/col9/2096232603557969237, isReference=false, sequence id=2470632548, length=8456716, majorCompaction=false
2010-05-12 19:22:48,807 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/1651418343/col9/7088206045660348092, isReference=false, sequence id=2960732840, length=19861, majorCompaction=false
2010-05-12 19:22:48,808 INFO org.apache.hadoop.hbase.regionserver.HRegion: region DocData,4824176,1273625075099/1651418343 available; sequence id is 2960732841
2010-05-12 19:22:48,808 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_OPEN: DocData,40682172,1273607630618
2010-05-12 19:22:48,809 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Opening region DocData,40682172,1273607630618, encoded=271889952
2010-05-12 19:22:50,924 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/271889952/CONTENT/4859380626868896307, isReference=false, sequence id=2959849236, length=337563, majorCompaction=false
2010-05-12 19:22:53,037 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/271889952/CONTENT/952776139755887312, isReference=false, sequence id=2082553088, length=110460013, majorCompaction=false
2010-05-12 19:22:57,404 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/271889952/col1/66449684560689857, isReference=false, sequence id=2959849236, length=12648, majorCompaction=false
2010-05-12 19:23:16,165 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Error opening DocData,40682172,1273607630618
java.lang.OutOfMemoryError: Java heap space
    at java.io.BufferedInputStream.<init>(BufferedInputStream.java:178)
    at org.apache.hadoop.hdfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:1369)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1626)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1743)
    at java.io.DataInputStream.readFully(DataInputStream.java:178)
    at java.io.DataInputStream.readFully(DataInputStream.java:152)
    at org.apache.hadoop.hbase.io.hfile.HFile$FixedFileTrailer.deserialize(HFile.java:1372)
    at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readTrailer(HFile.java:848)
    at org.apache.hadoop.hbase.io.hfile.HFile$Reader.loadFileInfo(HFile.java:793)
    at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:273)
    at org.apache.hadoop.hbase.regionserver.StoreFile.<init>(StoreFile.java:129)
    at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:410)
    at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:221)
    at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:1549)
    at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:312)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.instantiateRegion(HRegionServer.java:1564)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:1531)
    at org.apache.hadoop.hbase.regionserver.HRegionServer$Worker.run(HRegionServer.java:1451)
    at java.lang.Thread.run(Thread.java:619)
2010-05-12 19:23:18,246 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: OutOfMemoryError, aborting.
java.lang.OutOfMemoryError: Java heap space
    at java.io.BufferedInputStream.<init>(BufferedInputStream.java:178)
    at org.apache.hadoop.hdfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:1369)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1626)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1743)
    at java.io.DataInputStream.readFully(DataInputStream.java:178)
    at java.io.DataInputStream.readFully(DataInputStream.java:152)
    at org.apache.hadoop.hbase.io.hfile.HFile$FixedFileTrailer.deserialize(HFile.java:1372)
    at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readTrailer(HFile.java:848)
    at org.apache.hadoop.hbase.io.hfile.HFile$Reader.loadFileInfo(HFile.java:793)
    at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:273)
    at org.apache.hadoop.hbase.regionserver.StoreFile.<init>(StoreFile.java:129)
    at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:410)
    at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:221)
    at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:1549)
    at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:312)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.instantiateRegion(HRegionServer.java:1564)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:1531)
    at org.apache.hadoop.hbase.regionserver.HRegionServer$Worker.run(HRegionServer.java:1451)
    at java.lang.Thread.run(Thread.java:619)
2010-05-12 19:23:18,246 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=0.0, regions=942, stores=9411, storefiles=19887, storefileIndexSize=182, memstoreSize=0, compactionQueueSize=0, usedHeap=2999, maxHeap=2999, blockCacheSize=8847696, blockCacheFree=1878235312, blockCacheCount=1, blockCacheHitRatio=0, fsReadLatency=0, fsWriteLatency=0, fsSyncLatency=0
2010-05-12 19:23:18,247 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: worker thread exiting
2010-05-12 19:23:18,254 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 60020
2010-05-12 19:23:18,255 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 60020: exiting
2010-05-12 19:23:18,255 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 1 on 60020: exiting
2010-05-12 19:23:18,255 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60020: exiting
2010-05-12 19:23:18,255 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60020: exiting
And so on (the region server has a total of 100 handlers).
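
In case it helps whoever reads this, below is a small Python sketch that
pulls a few ratios out of the "Dump of metrics" line in the log above. It is
only a reading aid, not a diagnosis. I am assuming that usedHeap and maxHeap
are reported in MB (which matches the 3 GB heap I configured) and that
blockCacheSize and blockCacheFree are in bytes; the function names are my
own.

```python
import re

# Metrics line copied from the region server log above.
METRICS = (
    "request=0.0, regions=942, stores=9411, storefiles=19887, "
    "storefileIndexSize=182, memstoreSize=0, compactionQueueSize=0, "
    "usedHeap=2999, maxHeap=2999, blockCacheSize=8847696, "
    "blockCacheFree=1878235312, blockCacheCount=1, blockCacheHitRatio=0, "
    "fsReadLatency=0, fsWriteLatency=0, fsSyncLatency=0"
)


def parse_metrics(line: str) -> dict:
    """Turn 'key=value, key=value' pairs into a dict of floats."""
    return {k: float(v) for k, v in re.findall(r"(\w+)=([\d.]+)", line)}


def summarize(m: dict) -> dict:
    """Derive a few ratios; heap units (MB) are an assumption of mine."""
    heap_bytes = m["maxHeap"] * 1024 * 1024
    cache_bytes = m["blockCacheSize"] + m["blockCacheFree"]
    return {
        "stores_per_region": m["stores"] / m["regions"],
        "storefiles_per_store": m["storefiles"] / m["stores"],
        "heap_used_fraction": m["usedHeap"] / m["maxHeap"],
        "block_cache_share_of_heap": cache_bytes / heap_bytes,
    }


if __name__ == "__main__":
    for name, value in summarize(parse_metrics(METRICS)).items():
        print(f"{name}: {value:.2f}")
```

On these numbers the server carries roughly 10 stores per region and about 2
store files per store, and the heap is fully used at the time of the dump. If
my unit assumptions hold, the block cache is sized at about 60% of the heap.
I would appreciate someone checking whether these ratios look reasonable.
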