When I read sites like https://www.slant.co/versus/959/960/~fluentd_vs_flume
I get a bit discouraged at how people misunderstand Flume. Even a site like
https://www.predictiveanalyticstoday.com/data-ingestion-tools/ is misleading:
it copies our home page, saying only that "Flume is a distributed, reliable,
and available service for efficiently collecting, aggregating, and moving
large amounts of log data", and then copies the image. This leads users to
believe that Flume is only useful in a small set of use cases and is
intimately tied to Hadoop.

I believe the home page should be changed to say that "Flume is a
distributed, reliable, and available service for efficiently collecting,
aggregating, and streaming large amounts of data", followed by a statement
that it is appropriate for moving any kind of streaming data, such as
application, audit, or system logs; real-time events such as stock quotes;
or user transaction records.

The second sentence should also be modified to say "It is robust and fault
tolerant with tunable reliability mechanisms that can ensure guaranteed
delivery and many failover and recovery mechanisms".

I also think the very first image should be modified so that it does not show
just a web application and HDFS, as it seems to give people the impression
that Flume is only usable with Hadoop or in web applications. Unfortunately,
only the PNG seems to have been committed, so redoing the diagram will mean
starting from scratch.

Thoughts?

Ralph
