Thanks all for your help. I set out to update the hadoop-lzo jar from Todd Lipcon's git repo (https://github.com/toddlipcon/hadoop-lzo), and I encountered an error. I'm not a git user, so I could be doing something wrong, but I'm not sure what. Has something changed with this repo in the last month or two?
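For later readers: 502 errors during a `git pull` like the one below are often a symptom of git's old "dumb" HTTP transport walking loose objects and stale alternates one URL at a time. A minimal workaround sketch, assuming the repository is otherwise reachable and there are no local changes worth keeping, is to set the existing clone aside and re-clone over the git protocol, which negotiates a single pack instead:

```shell
# Hypothetical recovery: abandon the stale dumb-HTTP clone and start fresh.
# Move the old working copy aside first in case anything needs salvaging.
cd ..
mv hadoop-lzo hadoop-lzo.bak

# Re-clone over the git protocol rather than plain http://, avoiding the
# per-object HTTP walk that was returning 502s.
git clone git://github.com/toddlipcon/hadoop-lzo.git
cd hadoop-lzo
```

If the git protocol port is blocked by a firewall, cloning the same URL with an `http://` prefix against a freshly created clone may still succeed, since the failure above involved state left over in the existing repository.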
The error is pasted below:

[had...@ets-lax-prod-hadoop-01 hadoop-lzo]$ git pull
walk 7cbf6e85ad992faac880ef54a78ce926b6c02bda
walk fdbddcafd8276497d0181d40d72756336d204374
Getting alternates list for http://github.com/toddlipcon/hadoop-lzo.git
Also look at http://github.com/network/312869.git/
error: The requested URL returned error: 502 (curl_result = 22, http_code = 502, sha1 = 552b3f9cc1c7fd08bedfe029cf76a08e42302ae4)
Getting pack list for http://github.com/toddlipcon/hadoop-lzo.git
Getting pack list for http://github.com/network/312869.git/
error: The requested URL returned error: 502
error: Unable to find 552b3f9cc1c7fd08bedfe029cf76a08e42302ae4 under http://github.com/toddlipcon/hadoop-lzo.git
Cannot obtain needed commit 552b3f9cc1c7fd08bedfe029cf76a08e42302ae4 while processing commit fdbddcafd8276497d0181d40d72756336d204374.
fatal: Fetch failed.

Thanks,
Sandy

-----Original Message-----
From: Andrew Purtell [mailto:[email protected]]
Sent: Thursday, December 16, 2010 17:22
To: [email protected]
Cc: Cosmin Lehene
Subject: RE: Simple OOM crash?

Use hadoop-lzo-0.4.7 or higher from https://github.com/toddlipcon/hadoop-lzo

Best regards,

  - Andy

--- On Thu, 12/16/10, Sandy Pratt <[email protected]> wrote:

> From: Sandy Pratt <[email protected]>
> Subject: RE: Simple OOM crash?
> To: "[email protected]" <[email protected]>
> Cc: "Cosmin Lehene" <[email protected]>
> Date: Thursday, December 16, 2010, 4:00 PM
>
> The LZO jar installed is:
>
> hadoop-lzo-0.4.6.jar
>
> The native LZO libs are from EPEL (I think), installed on CentOS 5.5 64-bit:
>
> [had...@ets-lax-prod-hadoop-02 Linux-amd64-64]$ yum info lzo-devel
> Name       : lzo-devel
> Arch       : x86_64
> Version    : 2.02
> Release    : 2.el5.1
> Size       : 144 k
> Repo       : installed
> Summary    : Development files for the lzo library
> URL        : http://www.oberhumer.com/opensource/lzo/
> License    : GPL
> Description: LZO is a portable lossless data compression library written in ANSI C.
>            : It offers pretty fast compression and very fast decompression.
>            : This package contains development files needed for lzo.
>
> Is the direct buffer used only with LZO, or is it always involved with
> HBase reads/writes?
>
> Thanks for the help,
> Sandy
>
>
> -----Original Message-----
> From: Ryan Rawson [mailto:[email protected]]
> Sent: Thursday, December 16, 2010 15:50
> To: [email protected]
> Cc: Cosmin Lehene
> Subject: Re: Simple OOM crash?
>
> What LZO version are you using? You aren't running out of regular
> heap; you are running out of "Direct buffer memory", which is capped
> to prevent mishaps. There is a flag to increase that size:
>
> -XX:MaxDirectMemorySize=100m
>
> etc
>
> enjoy,
> -ryan
>
> On Thu, Dec 16, 2010 at 3:07 PM, Sandy Pratt <[email protected]> wrote:
> > Hello HBasers,
> >
> > I had a regionserver crash recently, and in perusing the logs it
> > looks like it simply had a bit too little memory. I'm running with a
> > 2200 MB heap on each regionserver. I plan to shave a bit off the
> > child VM allowance in favor of the regionserver to correct this,
> > probably bringing it up to 2500 MB. My question is whether there is
> > any more specific memory allocation I should make rather than simply
> > giving more to the RS. I wonder about this because of the following:
> >
> > load=(requests=0, regions=709, usedHeap=1349, maxHeap=2198)
> >
> > which suggests to me that there was heap available, but the RS
> > couldn't use it for some reason.
> >
> > Conjecture: I do run with LZO compression, so I wonder if I could be
> > hitting that memory leak referenced earlier on the list. I know
> > there's a new version of the LZO library available that I should
> > upgrade to, but is it also possible to simply alter the table to
> > gzip compression and do a major compaction, then uninstall LZO once
> > that completes?
> >
> > Log follows:
> >
> > 2010-12-15 20:01:05,239 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region ets.events,36345112f5654a29b308014f89c108e6,1279815820311.1063152548
> > 2010-12-15 20:01:05,239 DEBUG org.apache.hadoop.hbase.regionserver.Store: Major compaction triggered on store f1; time since last major compaction 119928149ms
> > 2010-12-15 20:01:05,240 INFO org.apache.hadoop.hbase.regionserver.Store: Started compaction of 2 file(s) in f1 of ets.events,36345112f5654a29b308014f89c108e6,1279815820311.1063152548 into hdfs://ets-lax-prod-hadoop-01.corp.adobe.com:54310/hbase/ets.events/1063152548/.tmp, sequenceid=25718885315
> > 2010-12-15 20:01:19,403 WARN org.apache.hadoop.hbase.regionserver.Store: Not in set org.apache.hadoop.hbase.regionserver.storescan...@7466c84
> > 2010-12-15 20:01:19,572 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: Aborting region server serverName=ets-lax-prod-hadoop-02.corp.adobe.com,60020,1289682554219, load=(requests=0, regions=709, usedHeap=1349, maxHeap=2198): Uncaught exception in service thread regionserver60020.compactor
> > java.lang.OutOfMemoryError: Direct buffer memory
> >         at java.nio.Bits.reserveMemory(Bits.java:656)
> >         at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:113)
> >         at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:305)
> >         at com.hadoop.compression.lzo.LzoCompressor.init(LzoCompressor.java:223)
> >         at com.hadoop.compression.lzo.LzoCompressor.reinit(LzoCompressor.java:207)
> >         at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:105)
> >         at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:112)
> >         at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.getCompressor(Compression.java:198)
> >         at org.apache.hadoop.hbase.io.hfile.HFile$Writer.getCompressingStream(HFile.java:391)
> >         at org.apache.hadoop.hbase.io.hfile.HFile$Writer.newBlock(HFile.java:377)
> >         at org.apache.hadoop.hbase.io.hfile.HFile$Writer.checkBlockBoundary(HFile.java:348)
> >         at org.apache.hadoop.hbase.io.hfile.HFile$Writer.append(HFile.java:530)
> >         at org.apache.hadoop.hbase.io.hfile.HFile$Writer.append(HFile.java:495)
> >         at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:817)
> >         at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:811)
> >         at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:670)
> >         at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:722)
> >         at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:671)
> >         at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:84)
> > 2010-12-15 20:01:19,586 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=0.0, regions=709, stores=709, storefiles=731, storefileIndexSize=418, memstoreSize=33, compactionQueueSize=15, usedHeap=856, maxHeap=2198, blockCacheSize=366779472, blockCacheFree=87883088, blockCacheCount=5494, blockCacheHitRatio=0
> > 2010-12-15 20:01:20,571 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 60020
> >
> > Thanks,
> >
> > Sandy
> >
> >
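For reference, the -XX:MaxDirectMemorySize flag Ryan mentions would typically be added to the regionserver JVM options in conf/hbase-env.sh. A minimal sketch; the 1g value is purely illustrative, not a tuning recommendation:

```shell
# conf/hbase-env.sh -- raise the cap on NIO direct-buffer allocations,
# which LzoCompressor draws from via ByteBuffer.allocateDirect().
# The 1g value below is an illustrative guess; size it for your workload.
export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize=1g"
```

Note that direct-buffer memory is allocated outside the Java heap, which is consistent with the crash above showing usedHeap=1349 of maxHeap=2198 while still hitting OutOfMemoryError.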
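On Sandy's question about moving the table to gzip before uninstalling LZO: a hedged sketch of what that might look like in the HBase shell, with the table and family names taken from the log above. This assumes the 0.20-era shell, where the table must be disabled before altering a column family, and note that every existing LZO-compressed storefile must be rewritten by major compaction before the LZO codec can safely be removed:

```shell
# Sketch only: switch family 'f1' of 'ets.events' from LZO to GZ,
# then rewrite all storefiles so none reference the LZO codec.
hbase shell <<'EOF'
disable 'ets.events'
alter 'ets.events', {NAME => 'f1', COMPRESSION => 'GZ'}
enable 'ets.events'
major_compact 'ets.events'
EOF
```

With 709 regions on the server, the major compaction would take a while and queue significant work; verifying (e.g. via the HFile tool) that no LZO-compressed files remain before removing the library would be prudent.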
