> "As a result, we designed and built Flume...
> (I wonder if this could deliver into Cassandra :) )


Yes - apparently it's pretty easy to do - I was thinking of doing it but 
haven't found the time yet.

https://issues.cloudera.org//browse/FLUME-20
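
In case it helps anyone, here's a rough sketch of what a Cassandra sink for the 
original Cloudera Flume could look like, using the Hector client for the writes. 
The class name, the EventSink.Base / Event methods, and the keyspace / column 
family parameters are my guesses at that era's APIs, not the actual FLUME-20 
patch - treat it as a starting point only.

import java.io.IOException;

import com.cloudera.flume.core.Event;
import com.cloudera.flume.core.EventSink;

import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;

public class SimpleCassandraSink extends EventSink.Base {

    private final String hosts;          // e.g. "localhost:9160"
    private final String keyspaceName;   // assumed to already exist
    private final String columnFamily;   // assumed to already exist

    private Mutator<String> mutator;

    public SimpleCassandraSink(String hosts, String keyspaceName, String columnFamily) {
        this.hosts = hosts;
        this.keyspaceName = keyspaceName;
        this.columnFamily = columnFamily;
    }

    @Override
    public void open() throws IOException {
        // Connect to the cluster once per sink instance.
        Cluster cluster = HFactory.getOrCreateCluster("flume-cluster", hosts);
        Keyspace keyspace = HFactory.createKeyspace(keyspaceName, cluster);
        mutator = HFactory.createMutator(keyspace, StringSerializer.get());
    }

    @Override
    public void append(Event e) throws IOException {
        // Key each log line by its event timestamp and store the raw body in
        // one column. getTimestamp()/getBody() are assumed accessors on the
        // old Event interface.
        String rowKey = Long.toString(e.getTimestamp());
        String body = new String(e.getBody());
        mutator.insert(rowKey, columnFamily,
                HFactory.createStringColumn("body", body));
    }

    @Override
    public void close() throws IOException {
        // Nothing buffered in this sketch; a real sink would batch mutations
        // and release the cluster connection here.
    }
}

A production version would obviously want batching, retries and a saner row key 
(a per-host timestamp will collide), but the shape of it really is that small, 
which is probably why the ticket calls it easy.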

On Jul 28, 2010, at 4:30 PM, Aaron Morton wrote:

> 
>> If you are looking to store web logs and then do ad hoc queries, you 
>> might (and probably should) be using Hadoop (depending on how big your logs are)
>  
> I agree. Take a look at Cloudera's Distribution for Hadoop 3 (CDH3); it 
> includes an app called Flume for moving data...
> 
> "As a result, we designed and built Flume. Flume is a distributed service 
> that makes it very easy to collect and aggregate your data into a persistent 
> store such as HDFS. Flume can read data from almost any source – log files, 
> Syslog packets, the standard output of any Unix process – and can deliver it 
> to a batch processing system like Hadoop or a real-time data store like 
> HBase. All this can be configured dynamically from a single, central location 
> – no more tedious configuration file editing and process restarting. Flume 
> will collect the data from wherever existing applications are storing it, and 
> whisk it away for further analysis and processing."
> 
> (I wonder if this could deliver into Cassandra :) )
> 
> If it's straight log file processing, Hadoop may be a better fit.
> 
> Aaron
