One other issue we haven't talked about is JBOD systems - some people
run over a dozen disks per machine; I run 4 disks/node. HDFS performs a
valuable service for HBase by balancing IO across the multiple JBOD
disks on a single node.  Handling disk issues is something we can
completely ignore in HBase right now, and it would be nice to keep it
that way.

On Mon, Apr 26, 2010 at 4:52 PM, Stack <st...@duboce.net> wrote:
> On Mon, Apr 26, 2010 at 3:36 PM, Geoff Hendrey <ghend...@decarta.com> wrote:
>> My thought with memory mapping was, as you noted, *not* to try to map files 
>> that are inside of HDFS but rather to copy as many blocks as possible out of 
>> HDFS, onto region server filesystems, and memory map the file on the region 
>> server. TB drives are now common. The virtual memory system of the Operating 
>> System manages paging in and out of "real" memory off disk when you use 
>> memory mapping. My experience with memory mapped ByteBuffer in Java is that 
>> it is very fast and scalable. By fast, I mean I have clocked reads in the 
>> microseconds using nanotime. So I was just wondering why you wouldn't at 
>> least make a 2nd level cache with memory mapping.
>>
>
>
> Are memory-mapped files scalable in Java?  I'm curious.  It's been a
> while since I played with them (circa Java 1.5), but back then they did
> not scale: I was only able to open a few files concurrently before I
> started running into "interesting" issues.  In HBase I'd need to be
> able to keep hundreds or even thousands open concurrently.
>
> I've thought about doing something like you propose, Geoff -- keeping
> some subset of storefiles locally (we could even write to two places
> when compacting, say, local and out to HDFS) -- but it always devolved
> quickly into a complicated mess: keeping the local copy up to date with
> the remote set, making sure the local copy didn't overflow local
> storage, and making sure local files were aged out on compactions and
> splits.  If you have a suggestion on how it'd work, I'm all ears.
>
> Thanks,
> St.Ack
>
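
For reference, here is roughly the read path Geoff describes above -- a
minimal sketch that maps a local file with FileChannel.map and times a
read with System.nanoTime(). The path is made up; nothing in HBase
writes such a local copy today.

  import java.io.RandomAccessFile;
  import java.nio.MappedByteBuffer;
  import java.nio.channels.FileChannel;

  public class MmapReadSketch {
    public static void main(String[] args) throws Exception {
      RandomAccessFile raf = new RandomAccessFile("/data/local/storefile", "r");
      FileChannel ch = raf.getChannel();

      // A single map() is capped at Integer.MAX_VALUE bytes, so a
      // terabyte-sized file needs hundreds of mappings -- which is where
      // the "hundreds or even thousands open concurrently" concern comes in.
      long len = Math.min(ch.size(), Integer.MAX_VALUE);
      MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, len);

      long start = System.nanoTime();
      byte b = buf.get((int) (len / 2));  // page fault on first touch
      long elapsed = System.nanoTime() - start;
      System.out.println("read byte " + b + " in " + elapsed + " ns");

      ch.close();
      raf.close();
    }
  }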

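And if someone wanted to prototype the "write to two places when
compacting" idea, a rough sketch over the Hadoop FileSystem API might
look like the following. The class, the paths, and the eviction hook are
all invented for illustration; they don't correspond to anything in
HBase today.

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class DualWriteSketch {
    private final FileSystem hdfs;   // the authoritative copy
    private final FileSystem local;  // best-effort local cache

    public DualWriteSketch(Configuration conf) throws Exception {
      this.hdfs = FileSystem.get(conf);
      this.local = FileSystem.getLocal(conf);
    }

    // Write the same bytes to HDFS and to the local cache directory.
    public void writeBoth(Path hdfsPath, Path localPath, byte[] block) throws Exception {
      FSDataOutputStream toHdfs = hdfs.create(hdfsPath);
      FSDataOutputStream toLocal = local.create(localPath);
      try {
        toHdfs.write(block);
        toLocal.write(block);
      } finally {
        toHdfs.close();
        toLocal.close();
      }
    }

    // Called when a compaction or split retires the storefile; the HDFS
    // copy goes through the normal delete path, we just keep the local
    // cache in step so it doesn't overflow local storage.
    public void evictLocal(Path localPath) throws Exception {
      local.delete(localPath, false);
    }
  }

The hard part, as Stack notes above, is wiring that eviction hook into
every place a storefile can go away.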