I think the skipped task issue mentioned is the same issue described in AIRFLOW-872 <https://issues.apache.org/jira/browse/AIRFLOW-872>. I had a DAG that was consistently hitting it. I believe the skipped task has to have at least one downstream task for this to occur (e.g. LatestOnlyOperator >> DummyOp1 >> DummyOp2), which may also explain why it doesn't affect the tests.
On Sat, Feb 25, 2017 at 9:35 AM, Alex Van Boxel <a...@vanboxel.be> wrote:
> I think what I observed doesn't happen every time, otherwise we would see it
> every time. I'll see if it happens again tonight.
>
> On Sat, Feb 25, 2017 at 1:24 PM Jeremiah Lowin <jlo...@apache.org> wrote:
>
> > Interesting if this is related to what I was seeing -- but to be clear the
> > error I observed is non-deterministic and doesn't happen every time
> > (obviously, because otherwise there would be no passing Travis runs). Is
> > that the case for what you're describing, Dan/Alex?
> >
> > On Sat, Feb 25, 2017 at 4:13 AM Alex Van Boxel <a...@vanboxel.be> wrote:
> >
> > About: Skipped tasks potentially cause a dagrun to be marked
> > failure/success prematurely. Isn't that related to the discussion I had
> > with Max about the ONE_SUCCESS trigger? When skipping tasks, for now you
> > need to set ONE_SUCCESS. I had kind of a fix but it was rejected because it
> > changed behaviour.
> >
> > On Sat, Feb 25, 2017 at 9:19 AM Bolke de Bruin <bdbr...@gmail.com> wrote:
> >
> > > Not trying to muddy the waters, but the observation of Jeremiah (non
> > > deterministic outcomes) might have something to do with #3. I didn’t dive
> > > in deeper, yet.
> > >
> > > ======================================================================
> > > ERROR: test_backfill_examples (tests.BackfillJobTest)
> > > ----------------------------------------------------------------------
> > > Traceback (most recent call last):
> > >   File "/home/travis/build/apache/incubator-airflow/tests/jobs.py", line 164, in test_backfill_examples
> > >     job.run()
> > >   File "/home/travis/build/apache/incubator-airflow/airflow/jobs.py", line 200, in run
> > >     self._execute()
> > >   File "/home/travis/build/apache/incubator-airflow/airflow/jobs.py", line 1999, in _execute
> > >     raise AirflowException(err)
> > > AirflowException: ---------------------------------------------------
> > > Some task instances failed:
> > > set([('example_short_circuit_operator', 'condition_is_True', datetime.datetime(2016, 1, 1, 0, 0))])
> > >
> > > https://s3.amazonaws.com/archive.travis-ci.org/jobs/204780706/log.txt
> > >
> > > Bolke
> > >
> > > > On 25 Feb 2017, at 09:07, Bolke de Bruin <bdbr...@gmail.com> wrote:
> > > >
> > > > Hi Dan,
> > > >
> > > > - Backfill indeed runs only one dagrun at a time, see line 1755 of
> > > > jobs.py. I’ll think about how to fix this over the weekend (I think it was
> > > > my change that introduced this). Suggestions always welcome. Depending on
> > > > the impact it is a blocker or not. We don’t often use backfills and
> > > > definitely not at your size, so that is why it didn’t pop up with us. I’m
> > > > assuming blocker for now, btw.
> > > > - Speculation on the High DB Load. I’m not sure what your benchmark is
> > > > here (1.7.1 + multi processor dags?), but as you mentioned in the code
> > > > dependencies are checked a couple of times for one run and even task
> > > > instance. Dependency checking requires aggregation on the DB, which is a
> > > > performance killer. Annoying but not a blocker.
> > > > - Skipped tasks potentially cause a dagrun to be marked failure/success
> > > > prematurely. BranchOperators are widely used; if it affects these
> > > > operators, then it is a blocker.
> > > >
> > > > - Bolke
> > > >
> > > >> On 25 Feb 2017, at 02:04, Dan Davydov <dan.davy...@airbnb.com.INVALID> wrote:
> > > >>
> > > >> Update on old pending issues:
> > > >> - Black Squares in UI: Fix merged
> > > >> - Double Trigger Issue That Alex G Mentioned: Alex has a PR in flight
> > > >>
> > > >> New Issues:
> > > >> - Backfill seems to be having issues (only running one dagrun at a time),
> > > >> we are still investigating - might be a blocker
> > > >> - High DB Load (~8x more than 1.7) - We are still investigating but it's
> > > >> probably not a blocker for the release
> > > >> - Skipped tasks potentially cause a dagrun to be marked as failure/success
> > > >> prematurely - not sure whether or not to classify this as a blocker (only
> > > >> really an issue for users who use the BranchingPythonOperator, which AirBnB
> > > >> does)
> > > >>
> > > >> On Thu, Feb 23, 2017 at 5:59 PM, siddharth anand <san...@apache.org> wrote:
> > > >>
> > > >>> IMHO, a DAG run without a start date is non-sensical but is not enforced.
> > > >>> That said, our UI allows for the manual creation of DAG Runs without a
> > > >>> start date as shown in the images below:
> > > >>>
> > > >>> - https://www.dropbox.com/s/3sxcqh04eztpl7p/Screenshot%202017-02-22%2016.00.40.png?dl=0
> > > >>> - https://www.dropbox.com/s/4q6rr9dwghag1yy/Screenshot%202017-02-22%2016.02.22.png?dl=0
> > > >>>
> > > >>> On Wed, Feb 22, 2017 at 2:26 PM, Maxime Beauchemin <maximebeauche...@gmail.com> wrote:
> > > >>>
> > > >>>> Our database may have edge cases that could be associated with running any
> > > >>>> previous version that may or may not have been part of an official release.
> > > >>>>
> > > >>>> Let's see if anyone else reports the issue. If no one does, one option is
> > > >>>> to release 1.8.0 as is with a comment in the release notes, and have a
> > > >>>> future official minor apache release 1.8.1 that would fix these minor
> > > >>>> issues that are not deal breakers.
> > > >>>>
> > > >>>> @bolke, I'm curious, how long does it take you to go through one release
> > > >>>> cycle? Oh, and do you have a documented step by step process for releasing?
> > > >>>> I'd like to add the Pypi part to this doc and add committers that are
> > > >>>> interested to have rights on the project on Pypi.
> > > >>>>
> > > >>>> Max
> > > >>>>
> > > >>>> On Wed, Feb 22, 2017 at 2:00 PM, Bolke de Bruin <bdbr...@gmail.com> wrote:
> > > >>>>
> > > >>>>> So it is a database integrity issue? Afaik a start_date should always be
> > > >>>>> set for a DagRun (create_dagrun does so). I didn't check the code though.
> > > >>>>>
> > > >>>>> Sent from my iPhone
> > > >>>>>
> > > >>>>>> On 22 Feb 2017, at 22:19, Dan Davydov <dan.davy...@airbnb.com.INVALID> wrote:
> > > >>>>>>
> > > >>>>>> Should clarify this occurs when a dagrun does not have a start date, not a
> > > >>>>>> dag (which makes it even less likely to happen). I don't think this is a
> > > >>>>>> blocker for releasing.
> > > >>>>>>
> > > >>>>>>> On Wed, Feb 22, 2017 at 1:15 PM, Dan Davydov <dan.davy...@airbnb.com> wrote:
> > > >>>>>>>
> > > >>>>>>> I rolled this out in our prod and the webservers failed to load due to
> > > >>>>>>> this commit:
> > > >>>>>>>
> > > >>>>>>> [AIRFLOW-510] Filter Paused Dags, show Last Run & Trigger Dag
> > > >>>>>>> 7c94d81c390881643f94d5e3d7d6fb351a445b72
> > > >>>>>>>
> > > >>>>>>> This fixed it:
> > > >>>>>>> - </a> <span id="statuses_info" class="glyphicon glyphicon-info-sign" aria-hidden="true" title="Start Date: {{last_run.start_date.strftime('%Y-%m-%d %H:%M')}}"></span>
> > > >>>>>>> + </a> <span id="statuses_info" class="glyphicon glyphicon-info-sign" aria-hidden="true"></span>
> > > >>>>>>>
> > > >>>>>>> This is caused by assuming that all DAGs have start dates set, so a broken
> > > >>>>>>> DAG will take down the whole UI. Not sure if we want to make this a blocker
> > > >>>>>>> for the release or not, I'm guessing for most deployments this would occur
> > > >>>>>>> pretty rarely. I'll submit a PR to fix it soon.
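
The failure mode described in the message above, calling `strftime` on a start date that is not set, can also be avoided by guarding the value instead of dropping the tooltip. Below is a minimal Python sketch of such a guard. It assumes the missing value shows up as `None`; `start_date_title` and the "not set" wording are hypothetical, and this is neither Airflow code nor the fix that was applied. Only the `'%Y-%m-%d %H:%M'` format string is taken from the diff.

```python
from datetime import datetime
from typing import Optional


def start_date_title(start_date: Optional[datetime]) -> str:
    """Build the tooltip text for a run, tolerating a missing start date.

    Hypothetical helper for illustration only; not part of Airflow.
    """
    if start_date is None:
        # Assumption: a run created without a start date carries None here.
        return "Start Date: not set"
    # Same format string as the title attribute in the removed template line.
    return "Start Date: " + start_date.strftime("%Y-%m-%d %H:%M")


print(start_date_title(datetime(2017, 2, 22, 13, 15)))
print(start_date_title(None))
```

The same idea could be written as a conditional directly in the template; the point is only that a single run without a start date should not prevent the rest of the page from rendering.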
> > > >>>>>>>
> > > >>>>>>> On Tue, Feb 21, 2017 at 9:49 AM, Chris Riccomini <criccom...@apache.org> wrote:
> > > >>>>>>>
> > > >>>>>>>> Ack that the vote has already passed, but belated +1 (binding)
> > > >>>>>>>>
> > > >>>>>>>> On Tue, Feb 21, 2017 at 7:42 AM, Bolke de Bruin <bdbr...@gmail.com> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> IPMC Voting can be found here:
> > > >>>>>>>>>
> > > >>>>>>>>> http://mail-archives.apache.org/mod_mbox/incubator-general/201702.mbox/%3c676bdc9f-1b55-4469-92a7-9ff309ad0...@gmail.com%3e
> > > >>>>>>>>>
> > > >>>>>>>>> Kind regards,
> > > >>>>>>>>> Bolke
> > > >>>>>>>>>
> > > >>>>>>>>>> On 21 Feb 2017, at 08:20, Bolke de Bruin <bdbr...@gmail.com> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>> Hello,
> > > >>>>>>>>>>
> > > >>>>>>>>>> Apache Airflow (incubating) 1.8.0 (based on RC4) has been accepted.
> > > >>>>>>>>>>
> > > >>>>>>>>>> 9 “+1” votes received:
> > > >>>>>>>>>>
> > > >>>>>>>>>> - Maxime Beauchemin (binding)
> > > >>>>>>>>>> - Arthur Wiedmer (binding)
> > > >>>>>>>>>> - Dan Davydov (binding)
> > > >>>>>>>>>> - Jeremiah Lowin (binding)
> > > >>>>>>>>>> - Siddharth Anand (binding)
> > > >>>>>>>>>> - Alex van Boxel (binding)
> > > >>>>>>>>>> - Bolke de Bruin (binding)
> > > >>>>>>>>>>
> > > >>>>>>>>>> - Jayesh Senjaliya (non-binding)
> > > >>>>>>>>>> - Yi (non-binding)
> > > >>>>>>>>>>
> > > >>>>>>>>>> Vote thread (start):
> > > >>>>>>>>>> http://mail-archives.apache.org/mod_mbox/incubator-airflow-dev/201702.mbox/%3cD360D9BE-C358-42A1-9188-6c92c31a2...@gmail.com%3e
> > > >>>>>>>>>> <http://mail-archives.apache.org/mod_mbox/incubator-airflow-dev/201702.mbox/%3C7EB7B6D6-092E-48D2-AA0F-15f44376a...@gmail.com%3E>
> > > >>>>>>>>>>
> > > >>>>>>>>>> Next steps:
> > > >>>>>>>>>> 1) I will start the voting process at the IPMC mailing list. I do expect
> > > >>>>>>>>>> some changes to be required, mostly in documentation, maybe a license here
> > > >>>>>>>>>> and there. So, we might end up with changes to stable. As long as these are
> > > >>>>>>>>>> not (significant) code changes I will not re-raise the vote.
> > > >>>>>>>>>> 2) Only after the positive voting on the IPMC and finalisation I will
> > > >>>>>>>>>> rebrand the RC to Release.
> > > >>>>>>>>>> 3) I will upload it to the incubator release page, then the tarball
> > > >>>>>>>>>> needs to propagate to the mirrors.
> > > >>>>>>>>>> 4) Update the website (can someone volunteer please?)
> > > >>>>>>>>>> 5) Finally, I will ask Maxime to upload it to pypi. It seems we can keep
> > > >>>>>>>>>> the apache branding as libcloud is doing this as well
> > > >>>>>>>>>> (https://libcloud.apache.org/downloads.html#pypi-package).
> > > >>>>>>>>>>
> > > >>>>>>>>>> Jippie!
> > > >>>>>>>>>>
> > > >>>>>>>>>> Bolke
> >
> > --
> > _/
> > _/ Alex Van Boxel
>
> --
> _/
> _/ Alex Van Boxel