What about a hot standby namenode? A write-ahead log for crash recovery is fine, I think, for small I/O. At large volumes, though, the write-ahead log will consume a large share of the system's I/O capacity, because every block costs two writes (the log and the actual data). That brings us back to how current databases implement crash recovery.
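
To make the "two writes per block" cost concrete, here is a minimal sketch in Java. It is not how Hadoop or any particular database does it; the class name, method names, and file names are made up for illustration. It appends a block to a log file, forces it, then appends the same block to a data file, and counts the bytes handed to the OS.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only: a log-then-data write path.
final class WalSketch {

    /** Writes one block to the log, then to the data file; returns total bytes written. */
    static long writeWithLog(Path log, Path data, byte[] block) throws IOException {
        long written = 0;
        try (FileChannel logCh = FileChannel.open(log, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.APPEND);
             FileChannel dataCh = FileChannel.open(data, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            // First write: the log record (here simply a copy of the block).
            written += writeFully(logCh, block);
            // Ask the OS to push the log to the storage device before going on.
            logCh.force(true);
            // Second write: the actual data.
            written += writeFully(dataCh, block);
            dataCh.force(true);
        }
        return written;
    }

    /** Loops until the whole array has been written to the channel. */
    static long writeFully(FileChannel ch, byte[] bytes) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        long n = 0;
        while (buf.hasRemaining()) {
            n += ch.write(buf);
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("wal-sketch");
        byte[] block = new byte[64 * 1024];
        long total = writeWithLog(dir.resolve("edits.log"), dir.resolve("block.dat"), block);
        System.out.println("payload=" + block.length + " bytes, written=" + total + " bytes");
    }
}
```

The total written is twice the payload, which is the overhead I am worried about at large volumes. A real log would normally hold a compact record rather than a full copy of the block, so treat this as the worst case.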
Another thing I don't see in the picture is how Hadoop handles the underlying file system calls. Each platform implements its file system differently, and I believe that calling 'write' or 'flush' does not necessarily force the data to disk. I'm not sure whether this is inevitable and OS dependent, but I cannot find any documentation describing how Hadoop handles it.

P.S. I handle HA and the fail-over mechanism in my own application, but I think a framework should make this transparent (or semi-transparent) to the user.

-annndy

On Fri, Feb 29, 2008 at 1:54 PM, Joydeep Sen Sarma <[EMAIL PROTECTED]> wrote:
> I would agree with Ted. You should easily be able to get 100MBps write
> throughput on a standard Netapp box (with read bandwidth left over -
> since the peak write throughput rating is more than twice of that). Even
> at an average write throughput rate of 50MBps - the daily data volume
> would be (drumroll ..) 4+TB!
>
> So buffer to a decent box and copy stuff over ..
>
> -----Original Message-----
> From: Ted Dunning [mailto:[EMAIL PROTECTED]
> Sent: Friday, February 29, 2008 11:33 AM
> To: core-user@hadoop.apache.org
> Subject: Re: long write operations and data recovery
>
> Unless your volume is MUCH higher than ours, I think you can get by with a
> relatively small farm of log consolidators that collect and concatenate
> files.
>
> If each log line is 100 bytes after compression (that is huge really) and
> you have 10,000 events per second (also pretty danged high) then you are
> only writing 1MB/s. If you need a day of buffering (=100,000 seconds), then
> you need 100GB of buffer storage. These are very, very moderate
> requirements for your ingestion point.
>
> On 2/29/08 11:18 AM, "Steve Sapovits" <[EMAIL PROTECTED]> wrote:
>
> > Ted Dunning wrote:
> >
> >> In our case, we looked at the problem and decided that Hadoop wasn't
> >> feasible for our real-time needs in any case.
> >> There were several issues,
> >>
> >> - first, of all, map-reduce itself didn't seem very plausible for
> >> real-time applications. That left hbase and hdfs as the capabilities
> >> offered by hadoop (for real-time stuff)
> >
> > We'll be using map-reduce batch mode, so we're okay there.
> >
> >> The upshot is that we use hadoop extensively for batch operations
> >> where it really shines. The other nice effect is that we don't have
> >> to worry all that much about HA (at least not real-time HA) since we
> >> don't do real-time with hadoop.
> >
> > What I'm struggling with is the write side of things. We'll have a huge
> > amount of data to write that's essentially a log format. It would seem
> > that writing that outside of HDFS then trying to batch import it would
> > be a losing battle -- that you would need the distributed nature of HDFS
> > to do very large volume writes directly and wouldn't easily be able to take
> > some other flat storage model and feed it in as a secondary step without
> > having the HDFS side start to lag behind.
> >
> > The realization is that Name Node could go down so we'll have to have a
> > backup store that might be used during temporary outages, but that
> > most of the writes would be direct HDFS updates.
> >
> > The alternative would seem to be to end up with a set of distributed files
> > without some unifying distributed file system (e.g., like lots of Apache
> > web logs on many many individual boxes) and then have to come up with
> > some way to funnel those back into HDFS.
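
The sizing figures quoted in the thread are simple multiplication. A minimal Java sketch reproduces them, assuming decimal units (1 MB = 10^6 bytes); the class and method names are made up for illustration, and the inputs are the ones Ted and Joydeep give above.

```java
// Hypothetical helper for the back-of-envelope numbers quoted in the thread.
final class IngestSizing {

    /** Sustained write rate in bytes per second. */
    static long bytesPerSecond(long bytesPerEvent, long eventsPerSecond) {
        return bytesPerEvent * eventsPerSecond;
    }

    /** Storage needed to hold `seconds` worth of data at the given rate. */
    static long bufferBytes(long bytesPerSecond, long seconds) {
        return bytesPerSecond * seconds;
    }

    public static void main(String[] args) {
        // Ted's figures: 100 bytes per log line, 10,000 events per second.
        long rate = bytesPerSecond(100, 10_000);
        // Ted rounds a day up to 100,000 seconds.
        long buffer = bufferBytes(rate, 100_000);
        // Joydeep's figure: 50 MB/s sustained over a real day (86,400 seconds).
        long daily = bufferBytes(50_000_000L, 86_400);

        System.out.println("rate   = " + rate / 1_000_000 + " MB/s");
        System.out.println("buffer = " + buffer / 1_000_000_000L + " GB");
        System.out.println("daily  = " + daily / 1e12 + " TB");
    }
}
```

This gives 1 MB/s and 100 GB for Ted's case, and 4.32 TB per day for the 50MBps case, which matches the "4+TB" above.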