To duplicate/re-iterate my comments from that PR.: I've just realised that PR will remove the from airflow import DAG which means that (almost) every single dag in existince will need to be updated else it will be 100% broken. I can not stress just how much that is a non-starter for me. Breaking users code with just a "import not found" is a veto from me. Luckily as we are now on Python 3 we can use https://www.python.org/dev/peps/pep-0562/ to have module-level lazy attributes/imports. So we can at least have an (lazy-import) that would issue a deprecation warning so that users who don't read update instructions at least get told what the problem is. (And let's face it, how often have we all blindly upgraded a module or app without fully reading the instructions. It's going to happen.) Also the easier we make the migration the more likely it is to not leave users behind. So if they can run old dags with new Airflow so much the better. Lets take a lesson from Django here, they've thought long and hard about how to do migrations and upgrades. Django's policy can be summed up as: If code works in one version without warnings, it MUST work in the next version but can issue deprecation warnings, and the code can only be removed in the release after that. Jarek's proposed in the issue linked earlier that we write a migration tool which will fix up the code, but that is not my main concern. I feel very strongly that user's code that is currently running and not issuing any warnings should work in the next version. A migration tool can help remove the warnings, but I'm not a fan of breaking every one of our users installs -- it makes updates more work, which makes them less likely to happen! I'm also coming at this from a different point of view than simple migration - but that of asking what the of user facing API and what we should expose to them. Also to me, from airflow import DAG is nicer -- why should the DAG authors have to care about the exact internal implementation of where we've chosen to move this class to now?. Airflow and DAGS go hand in hand -- with PEP-562 we can have lazily imported classes but still present a nice interface to users. And plus we can then re-arrange things later under the hood without having to affect our users. After all, what is the harm if the importing everything is solved? My proposal here: Add in a _getattr_ based lazy import for DAG to airflow/__init__.py module Do NOT issue a deprecation warning for this.
Revert the change to all the imports in example dags etc so that those do from airflow import DAG. (I'm less worried about the internal API airflow users internally, so airflow.models.TaskInstance vs airflow.models.task_instance.TaskInstance is okay for me. I find the later uglier but if it doesn't affect users I don't mind it) Ash On Feb 22 2020, at 7:21 am, Jarek Potiuk <jarek.pot...@polidea.com> wrote: > So.. In the meantime our new contributor Matt implemented the move and we > are merging it https://github.com/apache/airflow/pull/7456 as we have > already several approvals. I hope it will help us to move faster a bit :) > > We might have another discussions on the models __init__.py stuff removal > and plugin mechanism refeactoring but those are other topics and I think we > can discuss it in the PR that will come next. No need to discuss it all here > > J. > On Wed, Feb 19, 2020 at 10:53 AM Kaxil Naik <kaxiln...@gmail.com> wrote: > > +1 - happy with that > > On Wed, Feb 19, 2020 at 8:09 AM Driesprong, Fokko <fo...@driesprong.frl> > > wrote: > > > > > I'm totally fine with that. > > > Cheers, Fokko > > > Op di 18 feb. 2020 om 13:46 schreef Jarek Potiuk < > > jarek.pot...@polidea.com > > > > : > > > > > > > > > > I believe this is one of the cases where we can just go with the > > > consensus > > > > indeed :). > > > > > > > > J. > > > > On Tue, Feb 18, 2020 at 12:51 PM Ash Berlin-Taylor <a...@apache.org> > > > wrote: > > > > > > > > > Do we need to have a vote on it? I'm mostly interested in answering > > the > > > > > question about vote in general terms rather than this specific case) > > > > > > > > > > What do we need votes on, and when is "yeah no one complained, let's > > do > > > > > it" enough? > > > > > > > > > > For example if someone had created a PR for this and had appropriate > > > > > instructions in UPDATING I would feel okay merging it. > > > > > > > > > > -a > > > > > On 18 February 2020 09:37:13 GMT, Jarek Potiuk < > > > jarek.pot...@polidea.com > > > > > > > > > > wrote: > > > > > > All right. I turn it into vote then :) > > > > > > > > > > > > > > > > > > On Tue, Feb 18, 2020 at 7:45 AM Driesprong, Fokko > > > > > > <fo...@driesprong.frl> > > > > > > wrote: > > > > > > > > > > > > > I don't have any objection, however, this isn't a [VOTE] right? ;) > > > > > > > Op di 18 feb. 2020 om 00:08 schreef Jarek Potiuk > > > > > > <jarek.pot...@polidea.com > > > > > > > > : > > > > > > > > > > > > > > > > > > > > > > I see that it's quite welcome change, so I think if no-one else > > > > > > objects > > > > > > > > within three days, I consider that a lazy consensus (not that > > > > > > > > > > > > > > > > > > > > > > > > > > > > > lazy > > > > > > in > > > > > > > fact) > > > > > > > > :) https://www.apache.org/foundation/voting.html > > > > > > > > > > > > > > > > On Mon, Feb 17, 2020 at 6:14 PM Maxime Beauchemin < > > > > > > > > maximebeauche...@gmail.com> wrote: > > > > > > > > > > > > > > > > > +1 > > > > > > > > > On Mon, Feb 17, 2020 at 7:32 AM Daniel Imberman < > > > > > > > > daniel.imber...@gmail.com > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > +1 on my end! > > > > > > > > > > via Newton Mail > > > > > > > > > > [ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cloudmagic.com/k/d/mailapp?ct=dx&cv=10.0.32&pv=10.14.6&source=email_footer_2 > > > > > > > > > > ] > > > > > > > > > > On Mon, Feb 17, 2020 at 12:30 AM, Driesprong, Fokko > > > > > > > > > > > > > > > > > > > > > > > > > <fo...@driesprong.frl > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > I like this as well. It will hopefully also reduce the > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > memory > > > > > > > footprint > > > > > > > > > of > > > > > > > > > > Airflow. > > > > > > > > > > > > > > > > > > > > The only thing I can think of is that it will reduce the > > test > > > > > > > coverage, > > > > > > > > > but > > > > > > > > > > that's a vanity metric anyway :-) > > > > > > > > > > > > > > > > > > > > Cheers, Fokko > > > > > > > > > > Op za 15 feb. 2020 om 13:37 schreef Ash Berlin-Taylor < > > > > > > > a...@apache.org > > > > > > > > > : > > > > > > > > > > > > > > > > > > > > > I'm massively in favour of this. And as a side effect it > > > > > > would > > > > > > > solve > > > > > > > > > an > > > > > > > > > > > issue a reports almost two years ago > > > > > > > > > > > https://issues.apache.org/jira/browse/AIRFLOW-1931 ( > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://issues.apache.org/jira/browse/AIRFLOW-1931?jql=project%20%3D%20AIRFLOW%20AND%20text%20~%20%22logging%20import%22 > > > > > > > > > > > ) > > > > > > > > > > > > > > > > > > > > > > The one outstanding question is how/where we move > > > > > > > > settings.initialize > > > > > > > > > > and > > > > > > > > > > > integrate_plugins to. I'm specifically thinking of > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > usecases > > > > > > > outside > > > > > > > > of > > > > > > > > > > > someone running an airflow subcommand, such as in tests, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > where you > > > > > > > > > want > > > > > > > > > > > airflow to be initialized. > > > > > > > > > > > Perhaps: > > > > > > > > > > > import airflow; airflow.initialize() > > > > > > > > > > > Or I wonder if we need that at all? Things sould maybe > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > integrate > > > > > > > > > plugins > > > > > > > > > > > when they need to (by making a property/method somewhere > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > that is > > > > > > > > > > memoized) > > > > > > > > > > > and likewise in settings? Callers not having to do this > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > would be > > > > > > > > > nicer, > > > > > > > > > > > certainly. > > > > > > > > > > > -a > > > > > > > > > > > On Feb 15 2020, at 12:31 pm, Jarek Potiuk < > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > jarek.pot...@polidea.com > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > TL;DR; I would like to ask the community for opinion > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > about > > > > > > > > reducing > > > > > > > > > > (or > > > > > > > > > > > > even removing) the number of automated imports we have > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > in > > > > > > > > > > > > `airflow/__init__.py` for Airflow 2.0. > > > > > > > > > > > > > > > > > > > > > > > > This issue is plaguing us for quite a while already > > and I > > > > > > think > > > > > > > we > > > > > > > > > > have a > > > > > > > > > > > > perfect opportunity to solve it in AIrflow 2.0. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Currently > > > > > > our > > > > > > > > > > > > `airflow/__init__.py` file contains the code I copied > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > below. > > > > > > > While > > > > > > > > > > > looking > > > > > > > > > > > > fairly innocent it causes a lot of problems - because > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > importing > > > > > > > > > > anything > > > > > > > > > > > > from any airflow package automatically imports probably > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 90% of > > > > > > > the > > > > > > > > > > > airflow > > > > > > > > > > > > internal code - all models, configurations, utils, Task > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Instance, > > > > > > > > > > > > BaseOperator and plenty others (also we initialise all > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > plugins > > > > > > > > where > > > > > > > > > > they > > > > > > > > > > > > are mostly not needed). What it really is - we have > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > implicit > > > > > > > > > > dependencies > > > > > > > > > > > > in our code that are causing various side effects: > > > > > > > > > > > > > > > > > > > > > > > > - pylint detects cyclic dependencies that are > > super-hard > > > > > > and > > > > > > > > > sometimes > > > > > > > > > > > > impossible to remove > > > > > > > > > > > > - mypy and pylint are very slow - mypy parallel more is > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > slowed > > > > > > > > down > > > > > > > > > by > > > > > > > > > > > > having to parse whole airflow in multiple instances, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > and > > > > > > pylint > > > > > > > > > > cannot > > > > > > > > > > be > > > > > > > > > > > > run in parallel at all as it starts behaving randomly > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > w/regards > > > > > > > > > cyclic > > > > > > > > > > > > dependency detections > > > > > > > > > > > > - we cannot really apply pylint and type annotations to > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > most of > > > > > > > > the > > > > > > > > > > core > > > > > > > > > > > > classes as it will add even more cyclic dependencies > > > > > > > > > > > > - last but not least - our CLI is really, really slow > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > because of > > > > > > > > > that > > > > > > > > > > - > > > > > > > > > > > > right now any CLI command even `airflow version` has to > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > pull in > > > > > > > > and > > > > > > > > > > > > initialise all the classes. Solving that slowness is > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > impossible > > > > > > > > > > without > > > > > > > > > > > > removing the __init__.py code > > > > > > > > > > > > > > > > > > > > > > > > The effect of this change is that most of DAGs and > > > plugins > > > > > > > written > > > > > > > > > so > > > > > > > > > > far > > > > > > > > > > > > for 1.10.* will not be compatible with Airflow 2.0 - in > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > all of > > > > > > > the > > > > > > > > > > DAGs > > > > > > > > > > > > import paths will have to be changed. > > > > > > > > > > > > > > > > > > > > > > > > However as I see it - it's not a problem whatsoever. > > > > > > People will > > > > > > > > > have > > > > > > > > > > to > > > > > > > > > > > > perform migration from 1.10.* -> 2.0 and we know it's > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > not > > > > > > going > > > > > > > to > > > > > > > > > be > > > > > > > > > > > > seamless. We are going to write some tools for the > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > migration and > > > > > > > > > > changing > > > > > > > > > > > > such import paths is super easy fix that we can > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > automate > > > > > > > > > super-easily. > > > > > > > > > > > > > > > > > > > > > > > > I'd love to hear community opinion on that. > > > > > > > > > > > > J. > > > > > > > > > > > > > > > > > > > > > > > > *Current `airflow/__init__.py`:* > > > > > > > > > > > > from typing import Callable, Optional > > > > > > > > > > > > from airflow import utils > > > > > > > > > > > > from airflow import settings > > > > > > > > > > > > from airflow import version > > > > > > > > > > > > from airflow.utils.log.logging_mixin import > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > LoggingMixin > > > > > > > > > > > > from airflow.configuration import conf > > > > > > > > > > > > from airflow.exceptions import AirflowException > > > > > > > > > > > > from airflow.models.dag import DAG > > > > > > > > > > > > > > > > > > > > > > > > __version__ = version.version > > > > > > > > > > > > settings.initialize() > > > > > > > > > > > > from airflow.plugins_manager import integrate_plugins > > > > > > > > > > > > login: Optional[Callable] = None > > > > > > > > > > > > integrate_plugins() > > > > > > > > > > > > -- > > > > > > > > > > > > Jarek Potiuk > > > > > > > > > > > > Polidea <https://www.polidea.com/> | Principal > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Software > > > > > > > Engineer > > > > > > > > > > > > > > > > > > > > > > > > M: +48 660 796 129 <+48660796129> > > > > > > > > > > > > [image: Polidea] <https://www.polidea.com/> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Jarek Potiuk > > > > > > > > Polidea <https://www.polidea.com/> | Principal Software > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Engineer > > > > > > > > > > > > > > > > M: +48 660 796 129 <+48660796129> > > > > > > > > [image: Polidea] <https://www.polidea.com/> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > Jarek Potiuk > > > > > > Polidea <https://www.polidea.com/> | Principal Software Engineer > > > > > > > > > > > > M: +48 660 796 129 <+48660796129> > > > > > > [image: Polidea] <https://www.polidea.com/> > > > > > > > > > > > > > > > > > > > > > > -- > > > > Jarek Potiuk > > > > Polidea <https://www.polidea.com/> | Principal Software Engineer > > > > > > > > M: +48 660 796 129 <+48660796129> > > > > [image: Polidea] <https://www.polidea.com/> > > > > > > > > > > > > -- > Jarek Potiuk > Polidea <https://www.polidea.com/> | Principal Software Engineer > > M: +48 660 796 129 <+48660796129> > [image: Polidea] <https://www.polidea.com/> >