Thanks all for your help. I set out to update the hadoop-lzo jar from Todd Lipcon's git repo (https://github.com/toddlipcon/hadoop-lzo), and I encountered an error. I'm not a git user, so I could be doing something wrong, but I'm not sure what. Has something changed with this repo in the last month or two?
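For later readers: 502 errors during a `git pull` like the one below are often a symptom of git's old "dumb" HTTP transport walking loose objects and stale alternates one URL at a time. A minimal workaround sketch, assuming the repository is otherwise reachable and there are no local changes worth keeping, is to set the existing clone aside and re-clone over the git protocol, which negotiates a single pack instead:

```shell
# Hypothetical recovery: abandon the stale dumb-HTTP clone and start fresh.
# Move the old working copy aside first in case anything needs salvaging.
cd ..
mv hadoop-lzo hadoop-lzo.bak

# Re-clone over the git protocol rather than plain http://, avoiding the
# per-object HTTP walk that was returning 502s.
git clone git://github.com/toddlipcon/hadoop-lzo.git
cd hadoop-lzo
```

If the git protocol port is blocked by a firewall, cloning the same URL with an `http://` prefix against a freshly created clone may still succeed, since the failure above involved state left over in the existing repository.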
The error is pasted below:

[had...@ets-lax-prod-hadoop-01 hadoop-lzo]$ git pull
walk 7cbf6e85ad992faac880ef54a78ce926b6c02bda
walk fdbddcafd8276497d0181d40d72756336d204374
Getting alternates list for http://github.com/toddlipcon/hadoop-lzo.git
Also look at http://github.com/network/312869.git/
error: The requested URL returned error: 502 (curl_result = 22, http_code = 502, sha1 = 552b3f9cc1c7fd08bedfe029cf76a08e42302ae4)
Getting pack list for http://github.com/toddlipcon/hadoop-lzo.git
Getting pack list for http://github.com/network/312869.git/
error: The requested URL returned error: 502
error: Unable to find 552b3f9cc1c7fd08bedfe029cf76a08e42302ae4 under http://github.com/toddlipcon/hadoop-lzo.git
Cannot obtain needed commit 552b3f9cc1c7fd08bedfe029cf76a08e42302ae4 while processing commit fdbddcafd8276497d0181d40d72756336d204374.
fatal: Fetch failed.

Thanks,
Sandy

-----Original Message-----
From: Andrew Purtell [mailto:[email protected]]
Sent: Thursday, December 16, 2010 17:22
To: [email protected]
Cc: Cosmin Lehene
Subject: RE: Simple OOM crash?

Use hadoop-lzo-0.4.7 or higher from https://github.com/toddlipcon/hadoop-lzo

Best regards,

  - Andy

--- On Thu, 12/16/10, Sandy Pratt <[email protected]> wrote:

> From: Sandy Pratt <[email protected]>
> Subject: RE: Simple OOM crash?
> To: "[email protected]" <[email protected]>
> Cc: "Cosmin Lehene" <[email protected]>
> Date: Thursday, December 16, 2010, 4:00 PM
>
> The LZO jar installed is:
>
> hadoop-lzo-0.4.6.jar
>
> The native LZO libs are from EPEL (I think), installed on CentOS 5.5 64-bit:
>
> [had...@ets-lax-prod-hadoop-02 Linux-amd64-64]$ yum info lzo-devel
> Name       : lzo-devel
> Arch       : x86_64
> Version    : 2.02
> Release    : 2.el5.1
> Size       : 144 k
> Repo       : installed
> Summary    : Development files for the lzo library
> URL        : http://www.oberhumer.com/opensource/lzo/
> License    : GPL
> Description: LZO is a portable lossless data compression library written in ANSI C.
>            : It offers pretty fast compression and very fast decompression.
>            : This package contains development files needed for lzo.
>
> Is the direct buffer used only with LZO, or is it always involved with
> HBase reads/writes?
>
> Thanks for the help,
> Sandy
>
>
> -----Original Message-----
> From: Ryan Rawson [mailto:[email protected]]
> Sent: Thursday, December 16, 2010 15:50
> To: [email protected]
> Cc: Cosmin Lehene
> Subject: Re: Simple OOM crash?
>
> What LZO version are you using? You aren't running out of regular
> heap; you are running out of "Direct buffer memory", which is capped
> to prevent mishaps. There is a flag to increase that size:
>
> -XX:MaxDirectMemorySize=100m
>
> etc
>
> enjoy,
> -ryan
>
> On Thu, Dec 16, 2010 at 3:07 PM, Sandy Pratt <[email protected]> wrote:
> > Hello HBasers,
> >
> > I had a regionserver crash recently, and in perusing the logs it
> > looks like it simply had a bit too little memory. I'm running with a
> > 2200 MB heap on each regionserver. I plan to shave a bit off the
> > child VM allowance in favor of the regionserver to correct this,
> > probably bringing it up to 2500 MB. My question is whether there is
> > any more specific memory allocation I should make rather than simply
> > giving more to the RS. I wonder about this because of the following:
> >
> > load=(requests=0, regions=709, usedHeap=1349, maxHeap=2198)
> >
> > which suggests to me that there was heap available, but the RS
> > couldn't use it for some reason.
> >
> > Conjecture: I do run with LZO compression, so I wonder if I could be
> > hitting that memory leak referenced earlier on the list. I know
> > there's a new version of the LZO library available that I should
> > upgrade to, but is it also possible to simply alter the table to
> > gzip compression and do a major compaction, then uninstall LZO once
> > that completes?
> >
> > Log follows:
> >
> > 2010-12-15 20:01:05,239 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region ets.events,36345112f5654a29b308014f89c108e6,1279815820311.1063152548
> > 2010-12-15 20:01:05,239 DEBUG org.apache.hadoop.hbase.regionserver.Store: Major compaction triggered on store f1; time since last major compaction 119928149ms
> > 2010-12-15 20:01:05,240 INFO org.apache.hadoop.hbase.regionserver.Store: Started compaction of 2 file(s) in f1 of ets.events,36345112f5654a29b308014f89c108e6,1279815820311.1063152548 into hdfs://ets-lax-prod-hadoop-01.corp.adobe.com:54310/hbase/ets.events/1063152548/.tmp, sequenceid=25718885315
> > 2010-12-15 20:01:19,403 WARN org.apache.hadoop.hbase.regionserver.Store: Not in set org.apache.hadoop.hbase.regionserver.storescan...@7466c84
> > 2010-12-15 20:01:19,572 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: Aborting region server serverName=ets-lax-prod-hadoop-02.corp.adobe.com,60020,1289682554219, load=(requests=0, regions=709, usedHeap=1349, maxHeap=2198): Uncaught exception in service thread regionserver60020.compactor
> > java.lang.OutOfMemoryError: Direct buffer memory
> >         at java.nio.Bits.reserveMemory(Bits.java:656)
> >         at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:113)
> >         at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:305)
> >         at com.hadoop.compression.lzo.LzoCompressor.init(LzoCompressor.java:223)
> >         at com.hadoop.compression.lzo.LzoCompressor.reinit(LzoCompressor.java:207)
> >         at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:105)
> >         at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:112)
> >         at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.getCompressor(Compression.java:198)
> >         at org.apache.hadoop.hbase.io.hfile.HFile$Writer.getCompressingStream(HFile.java:391)
> >         at org.apache.hadoop.hbase.io.hfile.HFile$Writer.newBlock(HFile.java:377)
> >         at org.apache.hadoop.hbase.io.hfile.HFile$Writer.checkBlockBoundary(HFile.java:348)
> >         at org.apache.hadoop.hbase.io.hfile.HFile$Writer.append(HFile.java:530)
> >         at org.apache.hadoop.hbase.io.hfile.HFile$Writer.append(HFile.java:495)
> >         at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:817)
> >         at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:811)
> >         at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:670)
> >         at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:722)
> >         at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:671)
> >         at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:84)
> > 2010-12-15 20:01:19,586 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=0.0, regions=709, stores=709, storefiles=731, storefileIndexSize=418, memstoreSize=33, compactionQueueSize=15, usedHeap=856, maxHeap=2198, blockCacheSize=366779472, blockCacheFree=87883088, blockCacheCount=5494, blockCacheHitRatio=0
> > 2010-12-15 20:01:20,571 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 60020
> >
> > Thanks,
> >
> > Sandy
> >
> >
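For reference, the -XX:MaxDirectMemorySize flag Ryan mentions would typically be added to the regionserver JVM options in conf/hbase-env.sh. A minimal sketch; the 1g value is purely illustrative, not a tuning recommendation:

```shell
# conf/hbase-env.sh -- raise the cap on NIO direct-buffer allocations,
# which LzoCompressor draws from via ByteBuffer.allocateDirect().
# The 1g value below is an illustrative guess; size it for your workload.
export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize=1g"
```

Note that direct-buffer memory is allocated outside the Java heap, which is consistent with the crash above showing usedHeap=1349 of maxHeap=2198 while still hitting OutOfMemoryError.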
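On Sandy's question about moving the table to gzip before uninstalling LZO: a hedged sketch of what that might look like in the HBase shell, with the table and family names taken from the log above. This assumes the 0.20-era shell, where the table must be disabled before altering a column family, and note that every existing LZO-compressed storefile must be rewritten by major compaction before the LZO codec can safely be removed:

```shell
# Sketch only: switch family 'f1' of 'ets.events' from LZO to GZ,
# then rewrite all storefiles so none reference the LZO codec.
hbase shell <<'EOF'
disable 'ets.events'
alter 'ets.events', {NAME => 'f1', COMPRESSION => 'GZ'}
enable 'ets.events'
major_compact 'ets.events'
EOF
```

With 709 regions on the server, the major compaction would take a while and queue significant work; verifying (e.g. via the HFile tool) that no LZO-compressed files remain before removing the library would be prudent.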
