Hey Todd, by saying that HDFS is able to read just small byte ranges, are you talking about the capability described in the original Bigtable paper? I mean the ability to read just part of a compressed SSTable block and use it in a block-cache kind of way. Thanks.
Renato M.

2010/4/26 Todd Lipcon <t...@cloudera.com>

> On Mon, Apr 26, 2010 at 3:36 PM, Geoff Hendrey <ghend...@decarta.com> wrote:
>
> > Let me preface this by saying that you all know much better than I do what
> > is best. I'm very impressed by what you've done, and so this isn't
> > criticism. Far from it. It's just curiosity.
> >
> > Memory indexes are "decent" because, while they are fast, they don't scale.
> > At some point you run out of RAM. Are you implementing an LRU cache? Since
> > the table is orders of magnitude larger than the memory available on any
> > region server (even accounting for the fact that a region server needs to
> > cache only its "shard"), it's hard to understand how I could support a 100%
> > cache hit rate for a TB-sized table and a reasonable number of region
> > servers.
> >
> > When you get a cache miss, and you almost always will when the table is
> > orders of magnitude larger than the cache, you need to read a whole block
> > out of HDFS.
>
> This is a common misconception about HDFS. There's no need to read an entire
> HDFS block at a time. Although the blocks may be 64MB+, you can certainly
> read very small byte ranges, and that's exactly what HBase does.
>
> For a more efficient method of accessing local data blocks, I did some
> initial experimentation in HDFS-347, but the speedup was not an order of
> magnitude.
>
> -Todd
>
> > My thought with memory mapping was, as you noted, *not* to try to map files
> > that are inside of HDFS, but rather to copy as many blocks as possible out
> > of HDFS onto region server filesystems, and memory map the file on the
> > region server. TB drives are now common. The virtual memory system of the
> > operating system manages paging in and out of "real" memory off disk when
> > you use memory mapping. My experience with memory-mapped ByteBuffer in Java
> > is that it is very fast and scalable. By fast, I mean I have clocked reads
> > in the microseconds using nanoTime. So I was just wondering why you
> > wouldn't at least make a second-level cache with memory mapping.
> >
> > -geoff
> >
> > -----Original Message-----
> > From: Ryan Rawson [mailto:ryano...@gmail.com]
> > Sent: Monday, April 26, 2010 1:24 PM
> > To: hbase-user@hadoop.apache.org
> > Subject: Re: optimizing for random access
> >
> > HFile uses in-memory indexes, so it needs only one seek to access data.
> > How is this only "decent"?
> >
> > As for memory-mapped files, given that HDFS files are not local, we can't
> > mmap() them. However, HBase does block caching in memory to reduce the
> > trips to HDFS.
> >
> > -ryan
> >
> > On Mon, Apr 26, 2010 at 11:33 AM, Geoff Hendrey <ghend...@decarta.com> wrote:
> > > Hi,
> > >
> > > Any pointers on how to optimize HBase for random access? My
> > > understanding is that HFile is decent at random access. Why doesn't it
> > > use memory-mapped I/O? (My reading on it indicated it uses "something
> > > like NIO".) I'd like my entire table to be distributed across region
> > > servers, so that random reads are quickly served by a region server
> > > without having to transfer a block from HDFS. Is this the right
> > > approach? I would have thought that some sort of memory-mapped region
> > > file would be perfect for this. Anyway, just looking to understand the
> > > best practice(s).
> > >
> > > -geoff
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
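
[Editor's note: for anyone following along, below is a minimal Java sketch of the two approaches discussed above. The first method uses the Hadoop FileSystem positioned-read API that Todd refers to (reading a small byte range without pulling the whole HDFS block); the second shows the kind of local memory-mapping Geoff describes, which only works on local files, which is Ryan's point about why HDFS files can't be mmap()ed directly. The file path and offsets are made up for illustration, and this is not HBase's actual read path.]

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SmallRangeReadSketch {

  // Positioned read of a small byte range from a file in HDFS. Only roughly
  // 'length' bytes travel from the datanode, even though the underlying
  // HDFS block may be 64MB or larger.
  static byte[] readRange(FileSystem fs, Path path, long offset, int length)
      throws IOException {
    byte[] buf = new byte[length];
    try (FSDataInputStream in = fs.open(path)) {
      in.readFully(offset, buf, 0, length);
    }
    return buf;
  }

  // Geoff's alternative: memory-map a byte range of a *local* copy of the
  // data. The mapping stays valid after the channel is closed; the OS pages
  // the data in and out on demand.
  static MappedByteBuffer mapLocalRange(String localPath, long offset, int length)
      throws IOException {
    try (RandomAccessFile raf = new RandomAccessFile(localPath, "r");
         FileChannel ch = raf.getChannel()) {
      return ch.map(FileChannel.MapMode.READ_ONLY, offset, length);
    }
  }

  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.get(new Configuration());
    // Hypothetical HFile path and offsets, purely for illustration.
    byte[] block = readRange(fs, new Path("/hbase/usertable/region/cf/hfile"),
        1234567L, 64 * 1024);
    System.out.println("read " + block.length + " bytes");
  }
}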