> I assume you mean the released source

I was thinking of a git reference (just like you can do with pip or
npm) so that people can more easily mix and match, but I don't have
strong opinions about this.


Donat

On Mon, Mar 28, 2022 at 7:31 PM Ralph Goers <ralph.go...@dslextreme.com> wrote:
>
> Every release a project does requires a vote and has to meet the ASF's 
> requirements for a release. That said, Apache Maven has seemingly dozens of 
> plugins that are all independently managed and released. If you look at the 
> Maven dev list you will see release votes happening for various things 
> several times a month. But their process used to include something that each 
> release manager updated to track the releases so they could be included in 
> the board report. Now I believe that is all handled by the Apache Reporter 
> service.
>
> I don’t believe our process would be quite that loose. For one, I really 
> don’t consider the way Flume allows new components to be a true plugin 
> architecture. I would still anticipate we would group releases of things but 
> nothing says it has to be like that.
>
> Downloading the source? I assume you mean the released source. That would be 
> available either by downloading the release zip/tar from the ASF distribution 
> site or by checking out the release tag from git. But I don’t understand why 
> you would do that.
> I use a customized version of Flume but I build it from the flume zip. This 
> is a bit painful as I have to then delete stuff I don’t want or want to 
> override. It would actually be easier for me if I were to reference the 
> various Flume artifacts I need as dependencies and use the dependency plugin 
> to add them to the Flume application I am building.
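>
> A minimal sketch of that approach, assuming the custom build declares the 
> Flume artifacts it needs as ordinary dependencies. The execution id and 
> output directory below are hypothetical:
>
> ```xml
> <!-- Fragment of a custom application's pom.xml (illustrative only) -->
> <build>
>   <plugins>
>     <plugin>
>       <groupId>org.apache.maven.plugins</groupId>
>       <artifactId>maven-dependency-plugin</artifactId>
>       <executions>
>         <execution>
>           <!-- hypothetical execution id -->
>           <id>copy-flume-libs</id>
>           <phase>package</phase>
>           <goals>
>             <goal>copy-dependencies</goal>
>           </goals>
>           <configuration>
>             <!-- hypothetical target directory for the assembled lib folder -->
>             <outputDirectory>${project.build.directory}/flume/lib</outputDirectory>
>             <includeScope>runtime</includeScope>
>           </configuration>
>         </execution>
>       </executions>
>     </plugin>
>   </plugins>
> </build>
> ```
>
> Only the declared dependencies and their transitive dependencies get 
> copied, so nothing has to be deleted afterwards.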
>
> Ralph
>
> > On Mar 28, 2022, at 9:57 AM, Bessenyei Balázs Donát <bes...@apache.org> 
> > wrote:
> >
> > What does the "module releases" thing look like from an ASF release
> > (process - voting, etc.) perspective?
> >
> > Alternatively, do we want a mechanism to be able to add modules
> > directly from source? (Homebrew-style)
> >
> >
> > Donat
> >
> > On Mon, Mar 28, 2022 at 6:43 PM Ralph Goers <ralph.go...@dslextreme.com> 
> > wrote:
> >>
> >> Thanks for the reply!
> >>
> >> In general I agree with what you are proposing. I’d probably suggest once 
> >> a quarter instead of every 2 months. I also wouldn’t necessarily have a 
> >> release of every component every quarter. If there have been no changes 
> >> there isn’t much of a point. And requiring that everything be released 
> >> together doesn’t really help. I would suggest that Flume would have a 
> >> flume-parent module that includes a parent pom.xml that all projects would 
> >> inherit from. It would include a dependency management section that 
> >> declares the version of dependencies that are used across projects. In 
> >> addition we would want a flume-bom that contains a pom.xml that includes a 
> >> dependency management section declaring all the versions of all components 
> >> for a specific Flume quarterly release.
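> >>
> >> A rough sketch of what such a flume-bom pom.xml might look like. The 
> >> component artifactIds and all version numbers below are placeholders, 
> >> not decided coordinates:
> >>
> >> ```xml
> >> <!-- Hypothetical flume-bom/pom.xml (illustrative only) -->
> >> <project xmlns="http://maven.apache.org/POM/4.0.0">
> >>   <modelVersion>4.0.0</modelVersion>
> >>   <groupId>org.apache.flume</groupId>
> >>   <artifactId>flume-bom</artifactId>
> >>   <version>2.0.0</version> <!-- placeholder quarterly release version -->
> >>   <packaging>pom</packaging>
> >>
> >>   <dependencyManagement>
> >>     <dependencies>
> >>       <!-- one entry per independently released component -->
> >>       <dependency>
> >>         <groupId>org.apache.flume</groupId>
> >>         <artifactId>flume-core</artifactId> <!-- placeholder name -->
> >>         <version>2.0.0</version>
> >>       </dependency>
> >>       <dependency>
> >>         <groupId>org.apache.flume</groupId>
> >>         <artifactId>flume-hdfs</artifactId> <!-- placeholder name -->
> >>         <version>2.0.1</version>
> >>       </dependency>
> >>     </dependencies>
> >>   </dependencyManagement>
> >> </project>
> >> ```
> >>
> >> A user would then import the BOM in their own dependencyManagement 
> >> section (type pom, scope import) and declare the Flume components they 
> >> need without specifying versions.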
> >>
> >> As for the versions, I am not sure why you wouldn’t just go with 
> >> 2.0.0-alpha, 2.0.0-beta or 2.0.0-beta1, 2.0.0-beta2 if you aren’t 
> >> comfortable labeling them as GA. Once things are stable you would then 
> >> release 2.0.0.
> >>
> >> Ralph
> >>
> >>> On Mar 28, 2022, at 7:24 AM, Sean Busbey <sbus...@apple.com.INVALID> 
> >>> wrote:
> >>>
> >>> That’s a really interesting possibility.
> >>>
> >>> For the 1.10 release I think we should still upgrade the Hive 1 version 
> >>> to the latest 1.y available, but I agree we’d be well served to get a 
> >>> handle on the increasing set of possible dependencies. A 2.0 release 
> >>> would be a great time to change around how deployment works so that folks 
> >>> don’t expect everything to show up in a single omnibus tarball from a 
> >>> single build as they do now.
> >>>
> >>> There’s a lot of things to take care of making that transition less 
> >>> painful, so I’d suggest we get an overall approach described but try to 
> >>> address it incrementally so we’re not facing a very long delay for 
> >>> further project releases.
> >>>
> >>> How about something like this?
> >>>
> >>> - Release 1.10.0 soon, only backward compatible releases
> >>> - Release 1.y.0 - every other month, backward compatible dependency 
> >>> updates and bug fixes
> >>> - Release 2.0 alpha - break up project into multiple repos, establish 
> >>> release cadence(s) w/o binary artifacts
> >>> - Release 2.1 beta - have an “easy path” convenience binary
> >>> - Release 2.2 expected to be production ready
> >>>
> >>> For at least those parts of the process that don’t require project svn 
> >>> access I can help with keeping regular 1.y maintenance releases going. We 
> >>> could decide ahead of time on when to stop them; e.g. 6 months after the 
> >>> first “production ready” flume 2.y release.
> >>>
> >>> For the 2.y releases, I think we’re going to have some growing pains in 
> >>> managing how we get from multiple repositories to PMC blessed releases 
> >>> and from there to artifacts someone could use to run flume if they’re 
> >>> used to our current deployment model. Setting expectations via alpha/beta 
> >>> labels and stated packaging goals means we should be able to work out 
> >>> friction points while still walking before we try to run with a long term 
> >>> sustainable path for the project. We could try to put some goal dates on 
> >>> those milestones once we have spent some time discussing details and 
> >>> trying to move things forward.
> >>>
> >>>> On Mar 27, 2022, at 4:19 AM, Ralph Goers <ralph.go...@dslextreme.com> 
> >>>> wrote:
> >>>>
> >>>> Sean, (and everyone else)
> >>>>
> >>>> You mentioned that you want to create separate maven modules to upgrade 
> >>>> hive & hbase.  The Flume build is already very large. In addition, 
> >>>> upgrading to Hive 3 looks like it will require Hadoop 3 while Hive 2 
> >>>> runs with Hadoop 2. This means both dependencies would need to be in the 
> >>>> parent pom. I find this problematic for the following reasons:
> >>>> - Flume contains a ton of dependencies and even more transitive 
> >>>> dependencies that are not declared. This makes creating new releases 
> >>>> really hard given how many dependencies have to be checked and upgraded.
> >>>> - As more modules are added the build is just going to get slower.
> >>>> - Some modules have dependencies on things that are no longer supported. 
> >>>> Again, that makes creating a full Flume release hard.
> >>>>
> >>>> I would suggest that unless security fixes require it we hold off on 
> >>>> creating upgrades in 1.10.0 for HBase and Hive beyond what you have 
> >>>> already done. Instead, we should create new repositories for the parts 
> >>>> of Flume we want to separate and maintain independently. The HBase and 
> >>>> Hive upgrades would end up going there.
> >>>>
> >>>> I believe this will speed up development since builds will no longer 
> >>>> take so long. It also means that PRs will go against the target repo 
> >>>> which should simplify things. Jira would remain the same as it is today. 
> >>>> The component would be used to identify the target repo.
> >>>>
> >>>> I would suggest that what should remain in the main Flume build would be 
> >>>> primarily configuration, core, node, sdk, and some of configfilters. I 
> >>>> would expect we would have separate repos for hbase, hdfs, hive, Kafka, 
> >>>> embedded-agent, tools, and legacy to start.
> >>>>
> >>>> Thoughts?
> >>>>
> >>>> Ralph
> >>>
> >>>
> >>
>
