When I read sites like https://www.slant.co/versus/959/960/~fluentd_vs_flume I get a bit discouraged at how people misunderstand Flume. Even a site like https://www.predictiveanalyticstoday.com/data-ingestion-tools/ is misleading: it simply copies our home page, saying "Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data", and then copies the image. This leads users to believe that Flume is only useful in a small set of use cases and is intimately tied to Hadoop.
I believe the home page should be changed to say that "Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and streaming large amounts of data", followed by a statement that it is appropriate for moving any kind of streaming data, such as application, audit, or system logs; real-time events such as stock quotes; or user transaction records. The second sentence should also be modified to say "It is robust and fault tolerant with tunable reliability mechanisms that can ensure guaranteed delivery and many failover and recovery mechanisms".

I also think the very first image should be modified so that it does not show just a web application and HDFS, as it seems to give people the impression that Flume is only usable with Hadoop or in web applications. Unfortunately, only the PNG seems to have been committed, so redoing the diagram will mean starting from scratch.

Thoughts?

Ralph