Sean (and everyone else),

You mentioned that you want to create separate Maven modules to upgrade Hive and HBase. The Flume build is already very large. In addition, upgrading to Hive 3 looks like it will require Hadoop 3, while Hive 2 runs with Hadoop 2. This means both dependencies would need to be in the parent pom.

I find this problematic for the following reasons:

1. Flume contains a ton of dependencies and even more transitive dependencies that are not declared. This makes creating new releases really hard given how many dependencies have to be checked and upgraded.
2. As more modules are added, the build is just going to get slower.
3. Some modules have dependencies on things that are no longer supported. Again, that makes creating a full Flume release hard.

I would suggest that, unless security fixes require it, we hold off on further HBase and Hive upgrades in 1.10.0 beyond what you have already done. Instead, we should create new repositories for the parts of Flume we want to separate and maintain independently. The HBase and Hive upgrades would end up going there. I believe this will speed up development since builds will no longer take so long. It also means that PRs will go against the target repo, which should simplify things. Jira would remain the same as it is today; the component would be used to identify the target repo.

I would suggest that what remains in the main Flume build would be primarily configuration, core, node, sdk, and some of configfilters. I would expect we would have separate repos for hbase, hdfs, hive, Kafka, embedded-agent, tools, and legacy to start.

Thoughts?

Ralph