The Flume build takes forever. It takes about 70 minutes on my 2019 MacBook Pro, which is a pretty beefy machine. In addition, the build continually fails in CI because it generates too much output and runs out of disk space.
I discussed this previously, but I would like to start breaking up Flume into separate, independently released repos. This should make releases easier.

Another point for discussion is that Flume is currently released as a packaged zip. Personally, I think this is a bad idea, as it includes ALL the Flume components whether they are required or not. It makes more sense to me to build Flume as a normal application using Maven dependencies. If you use the new Spring Boot support you will still get all the dependencies packaged in the deployable jar. Even if you don’t like using Spring Boot, I believe you can still use the Spring Boot Maven plugin to generate an executable jar.

To start this discussion off, I propose immediately creating the following repos:

1. flume-spring-boot
2. flume-kafka
3. flume-jdbc
4. flume-legacy
5. flume-hadoop (would contain the hive, hdfs, and hbase stuff)
6. flume-kudu
7. flume-jms
8. flume-twitter
9. flume-thrift

In addition, flume-search was already created, and I would like to move flume-ng-morphline-solr-sink there. For the time being the Elasticsearch module will need to be bypassed until it can be upgraded to a supportable version of ES.

Thoughts?

Ralph