Re: realtime hadoop

Daniel Wed, 25 Jun 2008 05:06:44 -0700

2008/6/24 Konstantin Shvachko <[EMAIL PROTECTED]>:

> > Also HDFS might be critical since to access your data you need to close
> > the file
>
> Not anymore. Since 0.16 files are readable while being written to.


Does this mean I can open the same file as both the map input and the reduce
output, so that I can update files in place instead of creating new ones?
Also, if I want to query individual records, should I use HBase rather than
HDFS, say if we have a large amount of data stored as (key, value) pairs?

Thanks.

>
>
> >> it as fast as possible. I need to be able to maintain some guaranteed
> >> max. processing time, for example under 3 minutes.
>
> It looks like you do not need very strict guarantees.
> I think you can use HDFS for data storage.
> I don't know what kind of data processing you do, but I agree with Stefan
> that map-reduce is designed for batch tasks rather than for real-time
> processing.
>
>
>
>
> Stefan Groschupf wrote:
>
>> Hadoop might be the wrong technology for you.
>> Map Reduce is a batch processing mechanism. Also HDFS might be critical
>> since to access your data you need to close the file, which means you might
>> have many small files, a situation where HDFS is not very strong (the
>> namespace is held in memory).
>> Hbase might be an interesting tool for you, also zookeeper if you want to
>> do something home grown...
>>
>>
>>
>> On Jun 23, 2008, at 11:31 PM, Vadim Zaliva wrote:
>>
>>> Hi!
>>>
>>> I am considering using Hadoop for (almost) realtime data processing. I
>>> have data coming in every second and I would like to use a Hadoop cluster
>>> to process it as fast as possible. I need to be able to maintain some
>>> guaranteed max. processing time, for example under 3 minutes.
>>>
>>> Does anybody have experience with using Hadoop in this manner? I would
>>> appreciate it if you could share your experience or give me pointers
>>> to some articles or pages on the subject.
>>>
>>> Vadim
>>>
>>>
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> 101tec Inc.
>> Menlo Park, California, USA
>> http://www.101tec.com
>>
>>
>>
>>
