Thanks Ash.

Some good points indeed. I am more and more convinced by SEMVER. I do agree
that consistently following SEMVER has some really nice properties and
makes user decisions easier. CALVER kind of passes the problem on to the users
rather than having the maintainers solve it, so the problem remains.

* 100% agree, we should use the "min-airflow version for this provider"
feature of setup.py (like we do for cncf.kubernetes) and it is rather
easy to maintain.
* I do agree with the point that having flexibility on deciding on
backwards compatible/incompatible changes is an important one and very
useful long term rather than assuming "full backwards compatibility" for
the entire life of the provider package.
* When I think a bit more about that, I think that it might not be as big
of an effort as it seems - provided that there is some kind of
discipline/automation at the time of commit that can be used at release
time. This way it will be the committer's/reviewer's responsibility to make
sure that the information is correct, while the release manager will only
process that information (automatically). That seems sustainable.
* I am still not sure about the MAJOR version prefix/independence from the
major Airflow version. The major release of Airflow will be a MAJOR event,
so releasing new packages compatible only with that version seems like an
entirely possible thing to do (we are doing it now with 2.0). Having the
prefix will allow us to continue releasing providers from a hypothetical "2_"
branch while we are already releasing "3_" (similar to what we agreed for
backport packages, which stay releasable for 3 months after the 2.0 release).
Maybe we can use the same approach as we used for backport packages, where we
could effectively put the "major" version of Airflow in the package name
('*apache-airflow-2-providers-google*' for example) to keep the versioning of
each provider package 100% SEMVER? I think that might allow us to make much
deeper "breaking" changes between 2/3 versions of Airflow - for example
refactoring the whole interface of BaseOperator. And it will also give the
provider <> airflow version relation that Vikram was talking about.
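
To make the third bullet a bit more concrete, here is a minimal sketch of how
the release-time step could work if every merged change carried a marker set
at commit/review time. The `Change` markers and the `next_version` helper are
hypothetical names made up for illustration; we have not agreed on any such
tooling yet.

```python
from enum import IntEnum
from typing import Iterable


class Change(IntEnum):
    """Hypothetical per-PR marker set by the committer/reviewer."""

    BUGFIX = 1
    FEATURE = 2
    BREAKING = 3


def next_version(current: str, changes: Iterable[Change]) -> str:
    """Derive the next SEMVER version from the markers of all merged changes."""
    major, minor, patch = (int(part) for part in current.split("."))
    markers = list(changes)
    if not markers:
        # Nothing was merged for this provider, so there is nothing to release.
        return current
    biggest = max(markers)
    if biggest == Change.BREAKING:
        return f"{major + 1}.0.0"
    if biggest == Change.FEATURE:
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```

With something like this, the release manager would only run the script over
the markers collected since the last release of a given provider, and the
judgment call stays with whoever reviewed the change.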

I updated the AIP-8
<https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-8+Split+Providers+into+Separate+Packages+for+Airflow+2.0>
to reflect that we are undecided about that one:

> Versioning proposal is still to be decided closer to the release time (we
> want to come up with a consistent process proposal to handle it). The
> options we considered so far:
>
>    - CALVER with MAJOR AIRFLOW prefix (2.YYYY.MM.DD) for all 2.* Airflow
>    releases. In the future 3.YYYY.MM.DD for all 3.* releases. Backwards
>    incompatibility is only allowed between MAJOR Airflow Versions
>
>
>    - SEMVER separately for each package with a process that allows
>    marking PRs as introducing features/breaking changes to aid automation of
>    release process. Dependency to Airflow version is maintained only by
>    setup.py specification for each package.
>
>
>    - SEMVER for each package but the package name contains major airflow
>    version (apache-airflow-2-provider-google)
>
>
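
For illustration only, this is roughly what each of the three options would
produce. The helper names are hypothetical, and the package name follows the
spelling used in the list above; treat it as a sketch of the naming schemes,
not as existing tooling.

```python
from datetime import date


def calver_version(airflow_major: int, released: date) -> str:
    """Option 1: CALVER with the major Airflow version as a prefix."""
    return f"{airflow_major}.{released:%Y.%m.%d}"


def airflow_requirement(min_version: str) -> str:
    """Option 2: the provider has its own SEMVER version; the link to
    Airflow is only a requirement string listed in the package's setup.py."""
    return f"apache-airflow>={min_version}"


def prefixed_package_name(airflow_major: int, provider: str) -> str:
    """Option 3: SEMVER per package, major Airflow version in the name."""
    return f"apache-airflow-{airflow_major}-provider-{provider}"


print(calver_version(2, date(2020, 9, 14)))
print(airflow_requirement("2.0.0"))
print(prefixed_package_name(2, "google"))
```
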
Anyone's input is welcome here :)

J


On Mon, Sep 14, 2020 at 12:48 PM Ash Berlin-Taylor <a...@apache.org> wrote:

>
>
> On Sep 14 2020, at 11:01 am, Jarek Potiuk <jarek.pot...@polidea.com>
> wrote:
>
> >>
> >>
> >> > We have to make sure that we have no dependencies core -> providers
> >>
> >> How do we handle writing logs to S3/GCS/Azure Blob storage, which
> >> depends on the hook from the provider to work?
> >>
> >
> > Good point. I will make sure that I address it in the PR as well. I think
> > those should only work when
> > the corresponding package is installed. And raise a warning otherwise.
> >
>
> An error at startup, I think, would be better than a warning.
>
>
>
> >>
> >> > Versioning proposal is CALVER with MAJOR AIRFLOW prefix (2.YYYY.MM.DD)
> >> > for all 2.* Airflow releases. In the future 3.YYYY.MM.DD for all 3.*
> >> > releases. Backwards incompatibility is only allowed between MAJOR
> >> > Airflow Versions
> >>
> >> I'd personally prefer us to use SemVer to allow us the freedom
> >> to make backwards-incompatible changes to a provider (within the same
> >> major Airflow release). This is what HashiCorp have done with their
> >> Terraform provider modules.
> >>
> >
> > This is a detail that we can still decide on independently of the vote.
> > But in my opinion this one is super difficult (procedurally and regarding
> > the release management time and effort needed) if we want to release each
> > provider separately.
> >
> > But I am open to it as well if you help to define what the SEMVER
> > implementation should look like and if we come up with something that
> > does not require a heavy release management overhead that cannot be
> > automated.
> >
> > What is your proposal for the implementation of SEMVER in this case?
> >
> > I can see the following options (but if you have other proposals - I am
> > happy to hear them):
> >
> > a) 2.0.N for all providers released in 2.0. But then, according to semver,
> > those would be just bugfixes (no new features added), so technically it's
> > not really SEMVER - it's basically the same as 2 + CALVER but with an
> > incrementing number instead of a date.
> >
> > b) 2.0.X.Y incrementing X always when new features are added and Y's when
> > there are bugfixes?
> >
> > c) 2.X.Y for all 2.* providers with X increasing when there are new
> > features and Ys when there are bugfixes.
> >
> > I think, if we go for b) or c): who, when (and based on what criteria),
> > will be deciding whether X or Y should be increased for those 60+ operators?
> > Would that be the contributors committing the code or the release manager
> > releasing those packages? Note that the latter literally requires reviewing
> > all those packages and deciding on a case-by-case basis. I would never want
> > to do it personally on a regular basis.
> >
> > What do you think Ash? Which option do you propose?
>
> I should preface this by saying that my feeling isn't a strong desire, nor
> is it blocking the AIP from being accepted.
>
> I was thinking just a more strict SemVer (but starting at 2.0.0) -- i.e.
> the providers version is not tied to the Airflow major version, so we
> could have a provider at version 5.2.0 and it still works with Airflow
> -- my thinking here was:
>
> 1. We start at 2.x just for the sake of it -- though it could just as
> easily start every airflow-provider-* at 1.0.0 with the next release
> 2. We use the existing setup.py/python dep process to limit which
> versions of apache-airflow a provider works with.
>
> My main reason for thinking 2) is that I think it's more likely that the
> providers will work with a hypothetical Airflow 3 unchanged than not,
> and so this avoids having to update/release a new provider version.
>
> After all -- one of the goals of providers is to ease upgrades, so
> only having to upgrade core, and not the providers at the same
> time, makes it easier on users.
>
> > Note that the latter literally requires to review those all packages and
> > decide on a
> > case-by-case basis.
>
> Yes, this is a bit more work for the release manager, I accept that, but
> we can use some automated process where we build up the changelog
> programmatically/from files.
>
> For example, we could use reno:
> https://docs.openstack.org/magnum/pike/contributor/reno.html which lets
> each PR specify the type of the change.
>
> Or we could do the same thing but lighter weight -- a bunch of
> Markdown/RST files in airflow/provider/x/changes/ that mark the type,
> and then scripting we write ourselves to build up a CHANGELOG for each
> provider.
>
> As I said above, I would _like_ us (the Airflow project) to do the work
> of deciding SemVer/breaking changes, rather than each of our users
> having to do this. But in this instance I am willing to be persuaded
> otherwise.
>
> -ash
>
> >
> > J.
> >
> >
> >
> >> -ash
> >>
> >> On Sep 13 2020, at 9:28 pm, Jarek Potiuk <jarek.pot...@polidea.com>
> wrote:
> >>
> >> > Hello Everyone,
> >> >
> >> > Last week, at the Airflow 2.0 meeting the people involved made a
> >> > decision that we are releasing Airflow 2.0 as a set of separate "core"
> >> > and "providers" packages - similarly to the 1.10 "backport providers"
> >> > packages.
> >> >
> >> > This decision effectively implements the "spirit" of the AIP-8
> >> > proposed by Tim Swast in January 2019. It's not an exact implementation -
> >> > we've learned a lot since the original proposal and the final
> >> > implementation is slightly different - rather than "multiple"
> >> > repositories we stay with the mono-repo approach, but we aim to achieve
> >> > many of the goals targeted by the AIP-8.
> >> >
> >> > I updated the original AIP-8
> >> > <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=100827303>
> >> > to reflect those changes.
> >> >
> >> > This is the formal Vote for this proposal. It follows the "vote on code
> >> > modification" policy of the ASF:
> >> > https://www.apache.org/foundation/voting.html#votes-on-code-modification
> >> >
> >> > We need three +1 votes for the proposal to proceed. All committers
> >> > have a binding vote.
> >> >
> >> > The vote will last 72 hours, which means that it ends on Wed 16th Sep 2020,
> >> > 10:30 PM CEST
> >> >
> >> > Consider this my binding +1.
> >> >
> >> > J
> >> >
> >> > --
> >> > Jarek Potiuk
> >> > Polidea | Principal Software Engineer
> >> >
> >> > M: +48 660 796 129
> >> >
> >>
> >
> >
> > --
> >
> > Jarek Potiuk
> > Polidea <https://www.polidea.com/> | Principal Software Engineer
> >
> > M: +48 660 796 129 <+48660796129>
> > [image: Polidea] <https://www.polidea.com/>
> >
>


-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>