I have looked at Bigtable and its SSTables, etc., but my question is specifically about how it is used with HDFS. HDFS is designed for large files, big blocks, and write-once / read-many sequential access, whereas reading and writing small individual rows is far more random than HDFS's inherent design. How do these two fit together, and how is the combination still able to deliver good performance?
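For what it's worth, here is a minimal toy sketch of the LSM-tree idea the Bigtable paper describes, which is how random small writes get turned into the large sequential writes HDFS wants. Everything here (class and method names, the in-memory lists standing in for files) is illustrative only, not the HBase API:

```python
class ToyLSMStore:
    """Toy LSM-tree sketch: random-looking writes are absorbed in memory,
    then flushed as one sorted, immutable, sequentially written file
    (an "SSTable" / HFile analogue). Names here are made up for illustration."""

    def __init__(self, flush_threshold=4):
        self.memtable = {}        # in-memory, mutable; sorted only at flush time
        self.sstables = []        # immutable sorted runs, newest last
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        # A small random write only touches memory (the real system also
        # appends to a write-ahead log -- itself a sequential write).
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # One large sequential write: the whole memtable, sorted by key.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        # Check memory first, then the immutable runs, newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for k, v in table:    # real SSTables use a block index, not a scan
                if k == key:
                    return v
        return None


store = ToyLSMStore(flush_threshold=3)
for i, k in enumerate(["banana", "apple", "cherry", "date"]):
    store.put(k, i)

print(len(store.sstables))   # 1 -- the first three puts flushed as one sorted run
print(store.get("apple"))    # 1 -- served from the flushed run
print(store.get("date"))     # 3 -- still sitting in the memtable
```

So HDFS never sees the small random writes at all; it only sees sequential log appends and large immutable flush files, which is exactly its sweet spot. Reads are the harder case, which is where HFile block indexes and caching come in.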
On Thu, Jul 7, 2011 at 11:22 AM, Andrew Purtell <[email protected]> wrote:
> Hi Mohit,
>
> Start here: http://labs.google.com/papers/bigtable.html
>
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein (via
> Tom White)
>
>
>>________________________________
>>From: Mohit Anchlia <[email protected]>
>>To: [email protected]
>>Sent: Thursday, July 7, 2011 11:12 AM
>>Subject: Hbase performance with HDFS
>>
>>I've been trying to understand how Hbase can provide good performance
>>using HDFS when purpose of HDFS is sequential large block sizes which
>>is inherently different than of Hbase where it's more random and row
>>sizes might be very small.
>>
>>I am reading this but doesn't answer my question. It does say that
>>HFile block size is different but how it really works with HDFS is
>>what I am trying to understand.
>>
>>http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html
