Let me preface this by saying that you all know much better than I do what is best. I'm very impressed by what you've done, and so this isn't criticism. Far from it. It's just curiosity.
Memory indexes are "decent" because, while they are fast, they don't scale: at some point you run out of RAM. Are you implementing an LRU cache? (A toy sketch of the kind of thing I mean is at the bottom of this mail.) Since the table is orders of magnitude larger than the memory available on any region server (even accounting for the fact that a region server only needs to cache its own "shard"), it's hard to see how I could sustain a 100% cache hit rate for a TB-sized table with a reasonable number of region servers. When you get a cache miss, and you almost always will when the table is orders of magnitude larger than the cache, you have to read a whole block out of HDFS.

My thought with memory mapping was, as you noted, *not* to try to map files that live inside HDFS, but rather to copy as many blocks as possible out of HDFS onto the region servers' local filesystems and memory map the files there. TB drives are now common. When you use memory mapping, the operating system's virtual memory system manages paging in and out of physical memory from disk. My experience with memory-mapped ByteBuffers in Java is that they are very fast and scalable: I have clocked reads in the microseconds using System.nanoTime() (see the second sketch at the bottom of this mail). So I was just wondering why you wouldn't at least build a second-level cache with memory mapping.

-geoff

-----Original Message-----
From: Ryan Rawson [mailto:ryano...@gmail.com]
Sent: Monday, April 26, 2010 1:24 PM
To: hbase-user@hadoop.apache.org
Subject: Re: optimizing for random access

HFile uses in-memory indexes so that it needs only one seek to access data. How is this only "decent"?

As for memory-mapped files: given that HDFS files are not local, we can't mmap() them. However, HBase does block caching in memory to reduce the trips to HDFS.

-ryan

On Mon, Apr 26, 2010 at 11:33 AM, Geoff Hendrey <ghend...@decarta.com> wrote:
> Hi,
>
> Any pointers on how to optimize HBase for random access? My
> understanding is that HFile is decent at random access. Why doesn't it
> use memory-mapped I/O? (My reading indicated it uses "something like
> NIO".) I'd like my entire table to be distributed across region
> servers, so that random reads are served quickly by a region server
> without having to transfer a block from HDFS. Is this the right
> approach? I would have thought that some sort of memory-mapped region
> file would be perfect for this. Anyway, just looking to understand the
> best practice(s).
>
> -geoff
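P.S. By "LRU cache" I mean something along these lines: a toy sketch built on java.util.LinkedHashMap, not HBase's actual block cache (the class name and capacity are made up by me):

import java.util.LinkedHashMap;
import java.util.Map;

// Toy LRU cache: keeps at most maxEntries blocks, evicting the least
// recently accessed entry when the limit is exceeded.
public class ToyLruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public ToyLruCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder=true gives LRU iteration order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}

On a miss you'd load the block from HDFS and put() it here, and the least recently used block falls out automatically.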
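P.P.S. And this is roughly the kind of read I was timing when I saw microsecond latencies (a minimal sketch, not HBase code; the file path, offset, and record size are invented for illustration):

import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MmapReadTiming {
    public static void main(String[] args) throws Exception {
        // Map a local file (e.g. a block copied out of HDFS onto the
        // region server's disk) into virtual memory; the OS pages it
        // in and out of physical RAM on demand. The path is hypothetical.
        RandomAccessFile raf = new RandomAccessFile("/data/block0", "r");
        try {
            FileChannel ch = raf.getChannel();
            MappedByteBuffer buf =
                ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());

            byte[] record = new byte[1024];
            long start = System.nanoTime();
            buf.position(123456); // some random offset within the mapping
            buf.get(record);      // served from the page cache, no per-read syscall
            long elapsedNanos = System.nanoTime() - start;
            System.out.println("read took " + elapsedNanos + " ns");
        } finally {
            raf.close();
        }
    }
}

The first touch of a page can fault and go to disk, but once a page is resident, reads are plain memory accesses, which is where the microsecond numbers come from.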