See below

> On Mar 2, 2023, at 3:01 PM, Sean Busbey <sbus...@apple.com.INVALID> wrote:
>
> Yay! I am very enthusiastic for this progress. Big +1 from me to use this as
> a chance to also transition off of jira.
>
> A couple of questions / concerns:
>
> a) when / how are we going to add additional repositories? Are we adding one
> whenever someone comes with a new source/sink?

I’d say it depends. For example, flume-search was added for the out-of-date Elasticsearch stuff, but I would also add support for Amazon’s OpenSearch there if we want to support it. So I would say that the guiding principle should be that everything in the repo is related in some way and is small enough that a group of individuals could support it easily.

>
> Do we want to have a flume-contrib or the like that things could land in
> initially? If we do this, can some of these components that are primarily
> examples live there? I’m thinking of things like flume-twitter,
> flume-morphline, flume-kudu, flume-legacy.

I wouldn’t create a flume-contrib, as that is just going to be a catch-all for stuff. I also see no relationship between twitter, morphline and legacy.

>
> b) how broad is the flume-hadoop meant to be? I presume the hbase sink(s)
> won’t be staying in flume-core. I would argue for an independent flume-hbase
> so we can use the current hbase client libraries without having to worry
> about Hadoop specifics.

I would add the components that are impacted whenever a Hadoop major release occurs. So I would think that would be Hive, HBase, and HDFS. Possibly Kudu, but I am not familiar enough with Kudu to know if it makes sense to bundle it in.

>
> c) I share some of Tristan’s concerns on downstream consumption. Can we add
> in a packaging repo that initially provides some kind of minimal downstream
> consumable flume to deploy as well as an omnibus deploy that contains
> everything like we have today?

Well, at the very least we will want to have a BOM pom. I’m certainly open to providing variations of a “packaged and deployable” Flume.

>
> d) when talking about components that can’t practically be supported, I’d
> like to flag up the current twitter source. It’s great for flume demos, but
> our current use relies on an API that is dropping out of support.
> Additionally, there is the newly looming possibility that there won’t be a
> free-for-use tier of the API to test against.

I was under the impression that a PR recently got merged to upgrade the twitter dependency. If it still uses the old API then yeah, it fits in that same category.

To be clear, my main goals for doing this are:
a) Be able to build and test Flume in a reasonable amount of time. Flume-ng-core is by far the slowest, so this change is only going to marginally help with that.
b) Reduce the size of the build so that the CI system can actually run builds and test stuff for us automatically. Right now the CI system is useless.

Ralph