On Mon, Apr 26, 2010 at 3:36 PM, Geoff Hendrey <ghend...@decarta.com> wrote:
> Let me preface this by saying that you all know much better than I do what is best. I'm very impressed by what you've done, and so this isn't criticism. Far from it. It's just curiosity.
>
> Memory indexes are "decent" because, while they are fast, they don't scale. At some point you run out of RAM. Are you implementing an LRU cache? Since the table is orders of magnitude larger than the memory available on any region server (even accounting for the fact that a region server needs to cache only its "shard"), it's hard to understand how I could support a 100% cache hit rate for a TB-sized table and a reasonable number of region servers.
>
> When you get a cache miss, and you almost always will when the table is orders of magnitude larger than the cache, you need to read a whole block out of HDFS.

This is a common misconception about HDFS. There's no need to read an entire HDFS block at a time. Although the blocks may be 64MB+, you can certainly read very small byte ranges, and that's exactly what HBase does.

For a more efficient method of accessing local data blocks, I did some initial experimentation in HDFS-347, but the speedup was not an order of magnitude.

-Todd

> My thought with memory mapping was, as you noted, *not* to try to map files that are inside of HDFS, but rather to copy as many blocks as possible out of HDFS onto region server filesystems and memory-map the file on the region server. TB drives are now common. The virtual memory system of the operating system manages paging in and out of "real" memory off disk when you use memory mapping. My experience with memory-mapped ByteBuffers in Java is that they are very fast and scalable. By fast, I mean I have clocked reads in the microseconds using nanoTime. So I was just wondering why you wouldn't at least make a second-level cache with memory mapping.
>
> -geoff
>
> -----Original Message-----
> From: Ryan Rawson [mailto:ryano...@gmail.com]
> Sent: Monday, April 26, 2010 1:24 PM
> To: hbase-user@hadoop.apache.org
> Subject: Re: optimizing for random access
>
> HFile uses in-memory indexes so that only one seek is needed to access data. How is this only "decent"?
>
> As for memory-mapped files, given that HDFS files are not local, we can't mmap() them. However, HBase does block caching in memory to reduce the trips to HDFS.
>
> -ryan
>
>
> On Mon, Apr 26, 2010 at 11:33 AM, Geoff Hendrey <ghend...@decarta.com> wrote:
> > Hi,
> >
> > Any pointers on how to optimize HBase for random access? My understanding is that HFile is decent at random access. Why doesn't it use memory-mapped I/O? (My reading on it indicated it uses "something like NIO".) I'd like my entire table to be distributed across region servers, so that random reads are quickly served by a region server without having to transfer a block from HDFS. Is this the right approach? I would have thought that some sort of memory-mapped region file would be perfect for this. Anyway, just looking to understand the best practice(s).
> >
> > -geoff

--
Todd Lipcon
Software Engineer, Cloudera
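
As a concrete illustration of the byte-range reads Todd describes: the minimal sketch below uses the standard Hadoop FileSystem API's positioned read to pull a small range out of a large HDFS file. The path, offset, and buffer size are hypothetical placeholders, and this is only a sketch of the general technique, not a description of HBase's internal read path.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RangeReadExample {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Hypothetical HFile path and offset, purely for illustration.
    Path path = new Path("/hbase/mytable/1234/family/somehfile");
    long offset = 123456789L;          // anywhere inside a 64MB+ block
    byte[] buf = new byte[64 * 1024];  // fetch only 64KB, not the whole block

    FSDataInputStream in = fs.open(path);
    try {
      // Positioned read: only this byte range travels from the DataNode.
      in.readFully(offset, buf, 0, buf.length);
    } finally {
      in.close();
    }
  }
}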
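
And a minimal sketch, under the same caveats, of the local memory-mapped read Geoff describes: map a file that has been copied out of HDFS onto the region server's local disk with java.nio and read at an arbitrary offset, letting the OS page cache handle paging. The local path and offsets are made-up placeholders; a single MappedByteBuffer is capped at 2GB, so a large file would need several mappings.

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MmapReadExample {
  public static void main(String[] args) throws IOException {
    // Hypothetical local copy of an HFile, purely for illustration.
    RandomAccessFile raf = new RandomAccessFile("/data/local-cache/somehfile", "r");
    try {
      FileChannel ch = raf.getChannel();
      // Map the first 1GB here; a larger file would need several mappings.
      long mapSize = Math.min(ch.size(), 1L << 30);
      MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, mapSize);

      byte[] value = new byte[128];
      map.position(42 * 1024);  // arbitrary offset inside the mapping
      map.get(value);           // served from the OS page cache when resident
    } finally {
      raf.close();
    }
  }
}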