*Motivation*

I think we should really start thinking about making it easier for our
users to migrate to 2.0. After implementing some recent changes related to
AIP-21 Changes in import paths
<https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-21%3A+Changes+in+import+paths>,
I think I have an idea that might help with it.

*Proposal*

We could package some of the new and improved 2.0 operators (moved to the
"providers" package) and let them be used in Python 3 environments of
Airflow 1.10.x.

This can be done case by case, per "cloud provider". It should not be
obligatory and should be largely driven by each provider. It's not yet the
full AIP-8 Split Hooks/Operators into separate packages
<https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=100827303>.
It's merely a backport of some operators/hooks to get them working in 1.10.
But by doing it we might try out the concept of splitting, learn about
maintenance problems and maybe implement the full *AIP-8* approach in 2.1
consistently across the board.

*Context*

Part of AIP-21 was to move import paths for cloud providers to separate
providers/<PROVIDER> packages. An example of that (the first provider we
have almost finished migrating) is the providers/google package (further
divided into gcp/gsuite etc.).

We've done a massive migration of all the Google-related operators, created
a few missing ones and retrofitted some old operators to follow GCP best
practices, fixing a number of problems and also implementing Python 3 and
Pylint compatibility. Some of these operators/hooks are not backwards
compatible. Those that are compatible are still available via the old
imports with a deprecation warning.
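
As a rough illustration of how an old import can keep working while
emitting a deprecation warning, here is a minimal sketch. Both class names
are hypothetical stand-ins, not actual Airflow operators, and the real
shims may well be implemented differently.

```python
import warnings


class NewStorageOperator:
    """Hypothetical operator living at the new providers/ import path."""

    def __init__(self, task_id, bucket):
        self.task_id = task_id
        self.bucket = bucket


class OldStorageOperator(NewStorageOperator):
    """Hypothetical alias kept at the old import path for compatibility."""

    def __init__(self, *args, **kwargs):
        # stacklevel=2 attributes the warning to the code that created the
        # operator rather than to this shim.
        warnings.warn(
            "OldStorageOperator is deprecated; use NewStorageOperator instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        # After warning, behave exactly like the new operator.
        super().__init__(*args, **kwargs)
```

In this sketch existing code that uses the old name keeps running
unchanged, and the warning tells users which class to move to.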

We've added missing tests (including system tests) and missing features,
improving some of the Google operators, giving users more capabilities
and fixing some issues. Those operators should pretty much "just work" in
Airflow 1.10.x (any recent version) on Python 3. We should be able to
release a separate pip-installable package with those operators that users
can install in Airflow 1.10.x.

Any user will be able to install this separate package in their Airflow
1.10.x installation and start using those new "provider" operators in
parallel with the old 1.10.x operators. Other providers ("microsoft",
"amazon") might follow the same approach if they want. We could even at
some point decide to move some of the core operators in a similar fashion
(for example following the structure proposed in the latest documentation:
fundamentals / software / etc.,
https://airflow.readthedocs.io/en/latest/operators-and-hooks-ref.html).
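
To make the "in parallel" part a bit more concrete, here is a small sketch
of how a DAG file might prefer a new provider operator when the separate
package is installed and fall back to the old one otherwise. The helper
first_importable and the package name hypothetical_backport_package are
assumptions made up for illustration; the demonstration uses a
standard-library class so that it runs without Airflow installed.

```python
import importlib


def first_importable(candidates):
    """Return the first attribute that can be imported.

    `candidates` is a list of (module_path, attribute_name) pairs, tried
    in order. Raises ImportError if none of them can be imported.
    """
    for module_path, attribute_name in candidates:
        try:
            module = importlib.import_module(module_path)
            return getattr(module, attribute_name)
        except (ImportError, AttributeError):
            # This candidate is not available here; try the next one.
            continue
    raise ImportError(
        "None of the candidates could be imported: %r" % (candidates,)
    )


# Demonstration with standard-library names so the sketch runs anywhere.
# In a DAG file the first pair would name an operator from the separately
# installed provider package and the second pair the old 1.10.x operator.
OrderedDictClass = first_importable(
    [
        ("hypothetical_backport_package.operators", "NewOperator"),
        ("collections", "OrderedDict"),  # fallback that always exists
    ]
)
```

With a pattern like this, the same DAG file could run both on
installations that have the separate package and on those that do not,
which might help when rolling the package out gradually.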

*Pros and cons*

There are a number of pros:

   - Users will have an easier migration path if they are deeply invested
   in the 1.10.* versions
   - It's possible to migrate in stages for people who are also invested in
   py2: *py2 (1.10) -> py3 (1.10) -> py3 + new operators (1.10) -> py3 +
   2.0*
   - Moving to new operators in py3 + new operators can be done gradually.
   Old operators will continue to work while new ones can be used more and
   more
   - People will be incentivised to migrate to Python 3 before 2.0 is out
   (by using new operators)
   - Each provider "package" can have an independent release schedule - and
   add functionality to already released Airflow versions.
   - We do not take away any functionality from the users - we just add
   more options
   - The releases can be voted on separately by the PMC - like the main
   Airflow releases - after the "stewards" of the package (per provider)
   perform a round of testing on 1.10.* versions.
   - Users will start migrating to new operators earlier and have a
   smoother switch to 2.0 later
   - The latest improved operators will start

There are three cons I could think of:

   - There will be quite a lot of duplication between old and new operators
   (they will co-exist in 1.10). That might confuse users and cause
   problems with cooperation between different operators/hooks
   - Having new operators in 1.10 on Python 3 might keep people from
   migrating to 2.0
   - It will require some maintenance and separate release overhead.

I already spoke to the Composer team @Google and they are very positive
about this. I also spoke to Ash and it seems it might also be OK for the
Astronomer team. We have Google's backing and support, and we can provide
maintenance and support for those packages - serving as an example for
other providers of how they can do it.

Let me know what you think - and whether I should make it into an official
AIP maybe?

J.



-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>