Hi,

No, in that case my comment can be considered incorrect. The HLog shouldn't fill up very fast, and your problem does sound memory-bound now (upper/lower watermark hits).
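
To make the thresholds in this thread concrete, here is a minimal sketch (not HBase code) that classifies a flushed memsize against the per-region limits described below, and computes the global upper/lower watermarks as fractions of the heap. The function names, the sample sizes, and the fractions are illustrative assumptions; plug in whatever your own hbase-site.xml sets for memstore.flush.size, memstore.block.multiplier, and the global memstore limits.

```python
MB = 1024 * 1024


def classify_flush(memsize, flush_size, block_multiplier):
    """Classify one flushed memsize against the per-region thresholds.

    flush_size       -- value of hbase.hregion.memstore.flush.size, in bytes
    block_multiplier -- value of hbase.hregion.memstore.block.multiplier
    """
    if memsize < flush_size:
        # Flushed before reaching flush.size, so something else forced it
        # (for example global memory pressure).
        return "early"
    if memsize > flush_size * block_multiplier:
        # Past the point where updates to the region are blocked.
        return "blocking"
    return "normal"


def global_watermarks(heap_bytes, upper_fraction, lower_fraction):
    """Return (upper, lower) global memstore watermarks in bytes.

    The fractions are assumed to be the configured global memstore
    upper and lower limits, expressed as fractions of the heap.
    """
    return heap_bytes * upper_fraction, heap_bytes * lower_fraction


if __name__ == "__main__":
    # Illustrative values only; substitute your own configuration.
    flush_size = 128 * MB
    multiplier = 2
    for memsize in (40 * MB, 130 * MB, 300 * MB):
        print(memsize // MB, "MB:", classify_flush(memsize, flush_size, multiplier))

    upper, lower = global_watermarks(4096 * MB, 0.4, 0.35)
    print("upper watermark (MB):", upper / MB)
    print("lower watermark (MB):", lower / MB)
```

Running the memsize values from the regionserver log through classify_flush gives a quick count of how many flushes were early versus blocking, which helps tell whether the global watermarks or the per-region limits are doing most of the work.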

On Sun, Jun 17, 2012 at 11:49 AM, Infolinks <am...@infolinks.com> wrote:
> Hi Harsh J,
>
> I'm not using WAL in my writes.
> Is there still log rolling?
>
> On Jun 17, 2012, at 7:40, Harsh J <ha...@cloudera.com> wrote:
>
>> Amit,
>>
>> Your values for HLog block size (hbase.regionserver.hlog.blocksize,
>> default is the HDFS default block size (64 MB unless you've raised it
>> properly), too low unless you also have HLog compression) and the
>> factor of max-hlogs-to-keep (hbase.regionserver.maxlogs, default 32
>> files) can easily cause premature flushing, as it is another criterion.
>> Given your write workload (which hits the WAL), this is definitely what
>> you're hitting.
>>
>> On Sat, Jun 16, 2012 at 7:47 PM, Amit Sela <am...@infolinks.com> wrote:
>>> Thanks Doug, I read the regions section from the book like you recommended
>>> but I still have some questions left.
>>>
>>> When running a massive write job, the regionserver log shows the memsize
>>> that is flushed. The problem is that most of the time the memsize is either
>>> much smaller than the memstore.flush.size configured (resulting in writing
>>> more files, which leads to frequent compactions) or bigger
>>> than memstore.flush.size * memstore.block.multiplier (resulting in "Blocking
>>> updates for 'IPC Server handler # on <port>...'").
>>> In some cases I also see HBaseServer throwing a ClosedChannelException:
>>> "WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler <handler #> on
>>> <port #> caught: java.nio.channels.ClosedChannelException"
>>>
>>> I guess these problems are also the cause of long (few minutes) pauses and,
>>> in extreme cases, Full GC during the write jobs.
>>>
>>> Any ideas, anyone?
>>>
>>> In general, I did some digging and couldn't find much about the write
>>> process in HBase from a "memory usage" point of view, besides the
>>> configuration descriptions. Maybe worth adding to the book.
>>>
>>> Thank you for all your help,
>>>
>>> Amit.
>>>
>>> On Mon, Jun 11, 2012 at 3:22 PM, Doug Meil
>>> <doug.m...@explorysmedical.com> wrote:
>>>
>>>> Hi there-
>>>>
>>>> Your understanding is on track.
>>>>
>>>> You probably want to read this section...
>>>>
>>>> http://hbase.apache.org/book.html#regions.arch
>>>>
>>>> ... as it covers those topics in more detail.
>>>>
>>>>
>>>> On 6/10/12 1:02 PM, "Amit Sela" <am...@infolinks.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I'm trying to better understand what's going on in the region server
>>>>> during a write to HBase.
>>>>>
>>>>> As I understand the process:
>>>>>
>>>>> 1. Data is written to memstore.
>>>>> 2. Once the memstore has reached hbase.hregion.memstore.flush.size ->
>>>>> memstore executes flush and writes a new StoreFile.
>>>>> 3. The number of StoreFiles increases until a compaction is triggered.
>>>>>
>>>>> To my understanding, the compaction is triggered after a compaction check
>>>>> is done, either by the CheckCompaction thread running in the background or
>>>>> by the memstore flush that was executed.
>>>>> The compaction triggered will be a minor compaction BUT it could be
>>>>> promoted to major if it includes all store files.
>>>>> When will it NOT include all store files? Say I set compactionThreshold to
>>>>> 3; then when the 3rd (or 4th) flush is executed, a compaction will be
>>>>> triggered and will be promoted to major since it includes all store files.
>>>>>
>>>>> Is this right? Can anyone elaborate?
>>>>
>>
>> --
>> Harsh J

--
Harsh J