2008/6/24 Konstantin Shvachko <[EMAIL PROTECTED]>:

> > Also HDFS might be critical since to access your data you need to close
> > the file
>
> Not anymore. Since 0.16 files are readable while being written to.

Does this mean I can use a single file as both the map input and the reduce output, so that I can update files instead of creating new ones?

Also, if I want to query the records, should I use HBase rather than HDFS, say if we have a large amount of data stored as (key, value) pairs?

Thanks.

> >> it as fast as possible. I need to be able to maintain some guaranteed
> >> max. processing time, for example under 3 minutes.
>
> It looks like you do not need very strict guarantees.
> I think you can use HDFS as data storage.
> I don't know what kind of data processing you do, but I agree with Stefan
> that map-reduce is designed for batch tasks rather than for real-time
> processing.
>
> Stefan Groschupf wrote:
>
>> Hadoop might be the wrong technology for you.
>> Map Reduce is a batch processing mechanism. Also HDFS might be critical
>> since to access your data you need to close the file - which means you
>> might have many small files, a situation where HDFS is not very strong
>> (the namespace is held in memory).
>> HBase might be an interesting tool for you, also ZooKeeper if you want to
>> do something home grown...
>>
>> On Jun 23, 2008, at 11:31 PM, Vadim Zaliva wrote:
>>
>>> Hi!
>>>
>>> I am considering using Hadoop for (almost) real-time data processing. I
>>> have data coming in every second and I would like to use a Hadoop cluster
>>> to process it as fast as possible. I need to be able to maintain some
>>> guaranteed max. processing time, for example under 3 minutes.
>>>
>>> Does anybody have experience with using Hadoop in such a manner? I would
>>> appreciate it if you could share your experience or give me pointers
>>> to some articles or pages on the subject.
>>>
>>> Vadim
>>
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> 101tec Inc.
>> Menlo Park, California, USA
>> http://www.101tec.com