Yes - apparently it's pretty easy to do. I was thinking of doing it but haven't found the time yet.

https://issues.cloudera.org//browse/FLUME-20

On Jul 28, 2010, at 4:30 PM, Aaron Morton wrote:

>
>> If you are looking to store web logs and then do ad hoc queries you
>> might/should be using Hadoop (depending on how big your logs are)
>
> I agree, take a look at the Cloudera Hadoop 3 CDH3, they include an app
> called Flume for moving data...
>
> "As a result, we designed and built Flume. Flume is a distributed service
> that makes it very easy to collect and aggregate your data into a persistent
> store such as HDFS. Flume can read data from almost any source – log files,
> Syslog packets, the standard output of any Unix process – and can deliver it
> to a batch processing system like Hadoop or a real-time data store like
> HBase. All this can be configured dynamically from a single, central location
> – no more tedious configuration file editing and process restarting. Flume
> will collect the data from wherever existing applications are storing it, and
> whisk it away for further analysis and processing."
>
> (I wonder if this could deliver into Cassandra :) )
>
> If it's straight log file processing Hadoop may be a better fit.
>
> Aaron