My thought was 2.0 should require tasks to be associated with a DAG on creation (maybe needs to be debated), 1.7.1 would **not** require this but start warning in prevision of 2.0
Same with dag inference, we do start doing it in 1.7.1, but warn when we do, though it doesn't matter because of the previous statement, there won't be inference if all tasks are associated. I have to share this analogy about inference I used yesterday. If you get in line, in a line of people that are in line to go see star wars, we can assume you're want to go see star wars. If you tell the next person in line "I'm here because I want to see star wars", the person should just answer "well yeah, that's pretty obvious", not "of course you dummy now get off this line" (which would be the equivalent of raising) Max On Fri, May 6, 2016 at 9:54 AM, Jeremiah Lowin <jlo...@apache.org> wrote: > The second point about moving add_task to a private method makes sense to > me, and I love `set_as_last` and `set_as_first` for convenience. Here are > my thoughts on the other two points: > > > * The absence of dag=dag (or equivalent context manager) WARNS about > deprecation in 2.0 > > - dag has never been a required argument for tasks (pre 1.7.1 or post) so > I'm not sure what behavior we would be deprecating. The only difference is > that 1.7.1 takes additional care when working with dag-less tasks to make > sure they are automatically added to any DAGs they interact with (via > set_upstream, for example). Otherwise you get in situations like the above > where a task in a DAG is allowed to depend on a task outside the DAG, > resulting in a broken DAG. > > > * set_upstream/downstream infers where needed, but WARNS about deprecation > of inference in 2.0 > > - inference is new in 1.7.1; previously, set_upstream could be called on > any two tasks without validating that they were in compatible DAGs (or any > DAG for that matter). I don't think we should put off doing that > inference/validation until 2.0, since the current behavior clearly allows > broken DAGs and should be fixed. To me, this doesn't represent an API > change that needs deprecation (especially since the API is identical), it's > a bug fix that prevents broken DAGs from being created. Maybe we have a > warning message that tells you if validation was done since that piece is a > new feature, but I don't see a reason to wait for 2.0 to do this > validation. The DAG is broken either way; in 1.7.0 you have to run it to > find that out and in 1.7.1 we tell you about it right away. > > > On Fri, May 6, 2016 at 10:53 AM Maxime Beauchemin < > maximebeauche...@gmail.com> wrote: > > > I think that there are currently 4 ways to attach a task to a DAG: > > * Operator(task_id='foo', dag=dag) > > * dag.add_task(t) > > * Context manager: `with DAG('foo') as dag:` > > * inferred from set_upstream() and set_downstream in 1.7.1 (from my > > understanding) > > > > The pattern of passing the DAG object to BaseOperator constructor should > > probably be mandatory as this is how the `default_args` and `params` > magic > > happens and it may be unclear to users that it is the case. I'm assuming > > that the behavior of the context manager is equivalent to the explictit > > dag=dag > > > > Knowing this, set_upstream shouldn't have to infer dag attribution, but > > simply check that the dags on both sides is the same object. > > > > So the changes I'd suggest would be: > > * The absence of dag=dag (or equivalent context manager) WARNS about > > deprecation in 2.0 > > * DAG.add_task becomes private DAG._add_task and now WARNS about > > deprecation in 2.0 on usage, and also warns when reattributing a DAG > twice > > * set_upstream/downstream infers where needed, but WARNS about > deprecation > > of inference in 2.0 > > * Add new methods to replace the upstream(dag.root) pattern, the terms > root > > is unclear (does it mean firsts or lasts?). SE I'd suggest convenience > > methods `set_as_last` and `set_as_first` for clarity > > > > Max > > > > On Fri, May 6, 2016 at 6:32 AM, Bolke de Bruin <bdbr...@gmail.com> > wrote: > > > > > I agree with disabling building broken DAGs and also disabling / not > > > supporting orphaned Operators/TaskInstances. It creates so many issues > > down > > > the line and the fixes are relatively easy. It will make airflow and > the > > > used DAGs more maintainable. > > > > > > So +1, just make sure it is well documented in UPDATING.md > > > > > > my 2 cents. > > > > > > Bolke > > > > > > > Op 6 mei 2016, om 07:22 heeft Jeremiah Lowin <jlo...@apache.org> het > > > volgende geschreven: > > > > > > > > Tonight I was working with Dan on fixing the speed regression with > > large > > > > DAGs: https://github.com/apache/incubator-airflow/pull/1470. That > > clears > > > > the first blocker for 1.7.1 as described in AIRFLOW-52. > > > > > > > > We wanted to ask for the group's thoughts on the second blocker. > > > Basically, > > > > the issue centers on this pattern: > > > > > > > > ```python > > > > # email is an operator WITHOUT a DAG. > > > > email = Operator(...) > > > > > > > > # create dependencies for email > > > > email.set_upstream(dag.roots) > > > > > > > > # add email to the DAG > > > > dag.add_task(email) > > > > ``` > > > > > > > > Why is this a problem? Under Airflow 1.7.0, this DAG is actually > > > completely > > > > broken after the set_upstream command, because it has a dependency > to a > > > > task that's not in the DAG. It can't be run and will even raise an > > > > exception if you do something simple like access dag.roots. HOWEVER, > > this > > > > building this broken DAG is allowed in 1.7.0 and the user cures it in > > the > > > > last line by explicitly adding the task. > > > > > > > > In https://github.com/apache/incubator-airflow/pull/1318, which will > > be > > > > part of 1.7.1, I took steps that prevent users from creating broken > > DAGs > > > at > > > > all. The relevant fix in this case is that the email task would be > > > > automatically added to the DAG (it would infer its membership from > the > > > > tasks in dag.roots). However, once that inference is made, the last > > line > > > > becomes illegal, since you can't add a task to a DAG it's already in. > > > > > > > > So here's the thing: because the last line becomes illegal, this code > > > > snippet will no longer run under 1.7.1. My understanding is that it > is > > > > being used in production at Airbnb, so I wanted to raise the issue to > > see > > > > if we can get comfortable with the change. > > > > > > > > My opinion is that being able to build a broken DAG is *always* a > bug, > > > and > > > > so the issue should be fixed even if that creates some > > incompatibilities > > > > for anyone exploiting it. Particularly in this case, where the remedy > > is > > > > simply to delete the last line. > > > > > > > > We thought about deprecating the behavior, but I don't see how we can > > > > because 1) we don't know for sure that the user is trying to do > > something > > > > illegal at the time of the set_upstream call, and 2) the guard > against > > > > double-adding a task to a DAG has been in Airflow for a very long > time, > > > so > > > > reverting it would constitute a really massive behavior change. > > > > > > > > So my vote is to proceed with the fix, but as it could potentially > > > > inconvenience the hand that feeds (and by "feeds" I mean "gave us > > > Airflow") > > > > I'd like to be sensitive to their needs. > > > > > > > > J > > > > > > > > On Thu, May 5, 2016 at 2:09 PM Dan Davydov > > > <dan.davy...@airbnb.com.invalid> > > > > wrote: > > > > > > > >> Moved discussion to > https://issues.apache.org/jira/browse/AIRFLOW-52 > > > and > > > >> updated the status of the task there. > > > >> > > > >> On Tue, May 3, 2016 at 2:32 AM, Dan Davydov <dan.davy...@airbnb.com > > > > > >> wrote: > > > >> > > > >>> It's per DAG unfortunately (we have some pretty funky DAGs here). > > > >>> On May 2, 2016 10:26 PM, "Bolke de Bruin" <bdbr...@gmail.com> > wrote: > > > >>> > > > >>>> Hi dan > > > >>>> > > > >>>> Is that per dag or per dag bag? Multiprocessing should parallelize > > dag > > > >>>> parsing so I am very curious. Let me know if I can help out. > > > >>>> Bolke > > > >>>> > > > >>>> Sent from my iPhone > > > >>>> > > > >>>>> On 3 mei 2016, at 01:47, Dan Davydov <dan.davy...@airbnb.com > > .INVALID > > > > > > > >>>> wrote: > > > >>>>> > > > >>>>> So a quick update, unfortunately we saw some DAGBag parsing time > > > >>>> increases > > > >>>>> (~10x for some DAGs) on the webservers with the 1.7.1rc3. Because > > of > > > >>>> this I > > > >>>>> will be working on a staging cluster that has a copy of our > > > production > > > >>>>> production DAGBag, and is a copy of our production airflow > > > >>>> infrastructure, > > > >>>>> just without the workers. This will let us debug the release > > outside > > > >> of > > > >>>>> production. > > > >>>>> > > > >>>>> On Thu, Apr 28, 2016 at 10:20 AM, Dan Davydov < > > > dan.davy...@airbnb.com > > > >>> > > > >>>>> wrote: > > > >>>>> > > > >>>>>> Definitely, here were the issues we hit: > > > >>>>>> - airbnb/airflow#1365 occured > > > >>>>>> - Webservers/scheduler were timing out and stuck in restart > cycles > > > >> due > > > >>>> to > > > >>>>>> increased time spent on parsing DAGs due to > > > airbnb/airflow#1213/files > > > >>>>>> - Failed tasks that ran after the upgrade and the revert (after > we > > > >>>>>> reverted the upgrade) were unable to be cleared (but running the > > > >> tasks > > > >>>>>> through the UI worked without clearing them) > > > >>>>>> - The way log files were stored on S3 was changed (airflow now > > > >>>> requires a > > > >>>>>> connection to be setup) which broke log storage > > > >>>>>> - Some DAGs were broken (unable to be parsed) due to package > > > >>>>>> reorganization in open-source (the import paths were changed) > (the > > > >>>> utils > > > >>>>>> refactor commit) > > > >>>>>> > > > >>>>>> On Thu, Apr 28, 2016 at 12:17 AM, Bolke de Bruin < > > bdbr...@gmail.com > > > > > > > >>>>>> wrote: > > > >>>>>> > > > >>>>>>> Dan, > > > >>>>>>> > > > >>>>>>> Are you able to share some of the bugs you have been hitting > and > > > >>>>>>> connected commits? > > > >>>>>>> > > > >>>>>>> We could at the very least learn from them and maybe even > improve > > > >>>> testing. > > > >>>>>>> > > > >>>>>>> Bolke > > > >>>>>>> > > > >>>>>>> > > > >>>>>>>>> Op 28 apr. 2016, om 06:51 heeft Dan Davydov > > > >>>>>>>> <dan.davy...@airbnb.com.INVALID> het volgende geschreven: > > > >>>>>>>> > > > >>>>>>>> All of the blockers were fixed as of yesterday (there was some > > > >> issue > > > >>>>>>> that > > > >>>>>>>> Jeremiah was looking at with the last release candidate which > I > > > >>>> think is > > > >>>>>>>> fixed but I'm not sure). I started staging the airbnb_1.7.1rc3 > > tag > > > >>>>>>> earlier > > > >>>>>>>> today, so as long as metrics look OK and the 1.7.1rc2 issues > > seem > > > >>>>>>> resolved > > > >>>>>>>> tomorrow I will release internally either tomorrow or Monday > (we > > > >> try > > > >>>> to > > > >>>>>>>> avoid releases on Friday). If there aren't any issues we can > > push > > > >> the > > > >>>>>>> 1.7.1 > > > >>>>>>>> tag on Monday/Tuesday. > > > >>>>>>>> > > > >>>>>>>> @Sid > > > >>>>>>>> I think we were originally aiming to deploy internally once > > every > > > >> two > > > >>>>>>> weeks > > > >>>>>>>> but we decided to do it once a month in the end. I'm not too > > sure > > > >>>> about > > > >>>>>>>> that so Max can comment there. > > > >>>>>>>> > > > >>>>>>>> We have been running 1.7.0 in production for about a month now > > and > > > >> it > > > >>>>>>>> stable. > > > >>>>>>>> > > > >>>>>>>> I think what really slowed down this release cycle is some > > commits > > > >>>> that > > > >>>>>>>> caused severe bugs that we decided to roll-forward with > instead > > of > > > >>>>>>> rolling > > > >>>>>>>> back. We can potentially try reverting these commits next time > > > >> while > > > >>>> the > > > >>>>>>>> fixes are applied for the next version, although this is not > > > always > > > >>>>>>> trivial > > > >>>>>>>> to do. > > > >>>>>>>> > > > >>>>>>>> On Wed, Apr 27, 2016 at 9:31 PM, Siddharth Anand < > > > >>>>>>>> siddharthan...@yahoo.com.invalid> wrote: > > > >>>>>>>> > > > >>>>>>>>> Btw, is anyone of the committers running 1.7.0 or later in > any > > > >>>> staging > > > >>>>>>> or > > > >>>>>>>>> production env? I have to say that given that 1.6.2 was the > > most > > > >>>> stable > > > >>>>>>>>> release and is 4 or more months old does not say much for our > > > >>>> release > > > >>>>>>>>> cadence or process. What's our plan for 1.7.1? > > > >>>>>>>>> > > > >>>>>>>>> Sent from Sid's iPhone > > > >>>>>>>>> > > > >>>>>>>>>>> On Apr 27, 2016, at 9:05 PM, Chris Riccomini < > > > >>>> criccom...@apache.org> > > > >>>>>>>>>> wrote: > > > >>>>>>>>>> > > > >>>>>>>>>> Hey all, > > > >>>>>>>>>> > > > >>>>>>>>>> I just wanted to check in on the 1.7.1 release status. I > know > > > >> there > > > >>>>>>> have > > > >>>>>>>>>> been some major-ish bugs, as well as several people doing > > tests. > > > >>>>>>> Should > > > >>>>>>>>> we > > > >>>>>>>>>> create a 1.7.1 release JIRA, and track outstanding issues > > there? > > > >>>>>>>>>> > > > >>>>>>>>>> Cheers, > > > >>>>>>>>>> Chris > > > >>>>>> > > > >>>> > > > >>> > > > >> > > > > > > > > >