Interesting if this is related to what I was seeing -- but to be clear the
error I observed is non-deterministic and doesn't happen every time
(obviously, because otherwise there would be no passing Travis runs). Is
that the case for what you're describing, Dan/Alex?

On Sat, Feb 25, 2017 at 4:13 AM Alex Van Boxel <a...@vanboxel.be> wrote:

About "Skipped tasks potentially cause a dagrun to be marked
failure/success prematurely": isn't that related to the discussion I had
with Max about the ONE_SUCCESS trigger rule? For now, when tasks get
skipped you need to set the trigger rule to ONE_SUCCESS. I had a fix of
sorts, but it was rejected because it changed behaviour.
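
To illustrate the ONE_SUCCESS point, here is a minimal sketch of how a join task behind a branch might be evaluated. It is a simplified model and not Airflow's actual code: the state and rule names mirror Airflow's vocabulary, but `is_runnable` is a hypothetical helper written only for this example.

```python
# Simplified model of trigger-rule evaluation; NOT Airflow's real implementation.
# `is_runnable` is a hypothetical helper used only for illustration.

SUCCESS, FAILED, SKIPPED = "success", "failed", "skipped"


def is_runnable(trigger_rule, upstream_states):
    """Return True if a task with this rule may run, given finished upstream states."""
    if trigger_rule == "all_success":
        # Every upstream task must have succeeded; a skipped one blocks the task.
        return all(state == SUCCESS for state in upstream_states)
    if trigger_rule == "one_success":
        # A single successful upstream task is enough.
        return any(state == SUCCESS for state in upstream_states)
    raise ValueError("unsupported trigger rule: %s" % trigger_rule)


# A branch chose one path, so the other upstream task was skipped.
join_upstreams = [SUCCESS, SKIPPED]

print(is_runnable("all_success", join_upstreams))  # False
print(is_runnable("one_success", join_upstreams))  # True
```

Under this model, a join task with the default rule never runs once one branch is skipped, which is why ONE_SUCCESS is needed as a workaround.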

On Sat, Feb 25, 2017 at 9:19 AM Bolke de Bruin <bdbr...@gmail.com> wrote:

> Not trying to muddy the waters, but Jeremiah's observation
> (non-deterministic outcomes) might have something to do with #3. I haven’t
> dug in deeper yet.
>
> ======================================================================
> ERROR: test_backfill_examples (tests.BackfillJobTest)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/travis/build/apache/incubator-airflow/tests/jobs.py", line 164, in test_backfill_examples
>     job.run()
>   File "/home/travis/build/apache/incubator-airflow/airflow/jobs.py", line 200, in run
>     self._execute()
>   File "/home/travis/build/apache/incubator-airflow/airflow/jobs.py", line 1999, in _execute
>     raise AirflowException(err)
> AirflowException: ---------------------------------------------------
> Some task instances failed:
> set([('example_short_circuit_operator', 'condition_is_True', datetime.datetime(2016, 1, 1, 0, 0))])
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/204780706/log.txt
>
> Bolke
>
> > On 25 Feb 2017, at 09:07, Bolke de Bruin <bdbr...@gmail.com> wrote:
> >
> > Hi Dan,
> >
> > - Backfill indeed runs only one dagrun at a time, see line 1755 of
> > jobs.py. I’ll think about how to fix this over the weekend (I think it was
> > my change that introduced this). Suggestions are always welcome. Whether
> > it is a blocker depends on the impact. We don’t often use backfills, and
> > definitely not at your scale, which is why it didn’t come up for us. I’m
> > assuming it is a blocker for now, btw.
> > - Speculation on the high DB load: I’m not sure what your benchmark is
> > here (1.7.1 + multi processor dags?), but as you mentioned, dependencies
> > are checked in the code a couple of times for one run and even per task
> > instance. Dependency checking requires aggregation on the DB, which is a
> > performance killer. Annoying, but not a blocker.
> > - Skipped tasks potentially cause a dagrun to be marked failure/success
> > prematurely. BranchOperators are widely used; if it affects these
> > operators, then it is a blocker.
> >
> > - Bolke
> >
> >> On 25 Feb 2017, at 02:04, Dan Davydov <dan.davy...@airbnb.com.INVALID> wrote:
> >>
> >> Update on old pending issues:
> >> - Black Squares in UI: Fix merged
> >> - Double Trigger Issue That Alex G Mentioned: Alex has a PR in flight
> >>
> >> New Issues:
> >> - Backfill seems to be having issues (only running one dagrun at a
> >> time), we are still investigating - might be a blocker
> >> - High DB Load (~8x more than 1.7) - We are still investigating but
> >> it's probably not a blocker for the release
> >> - Skipped tasks potentially cause a dagrun to be marked as
> >> failure/success prematurely - not sure whether or not to classify this
> >> as a blocker (only really an issue for users who use the
> >> BranchPythonOperator, which Airbnb does)
> >>
> >> On Thu, Feb 23, 2017 at 5:59 PM, siddharth anand <san...@apache.org> wrote:
> >>
> >>> IMHO, a DAG run without a start date is nonsensical, but this is not
> >>> enforced. That said, our UI allows for the manual creation of DAG Runs
> >>> without a start date, as shown in the images below:
> >>>
> >>>
> >>>  - https://www.dropbox.com/s/3sxcqh04eztpl7p/Screenshot%202017-02-22%2016.00.40.png?dl=0
> >>>  - https://www.dropbox.com/s/4q6rr9dwghag1yy/Screenshot%202017-02-22%2016.02.22.png?dl=0
> >>>
> >>>
> >>> On Wed, Feb 22, 2017 at 2:26 PM, Maxime Beauchemin <maximebeauche...@gmail.com> wrote:
> >>>
> >>>> Our database may have edge cases that could be associated with running
> >>>> any previous version that may or may not have been part of an official
> >>>> release.
> >>>>
> >>>> Let's see if anyone else reports the issue. If no one does, one option
> >>>> is to release 1.8.0 as is with a comment in the release notes, and have
> >>>> a future official minor Apache release 1.8.1 that would fix these minor
> >>>> issues that are not deal breakers.
> >>>>
> >>>> @bolke, I'm curious, how long does it take you to go through one
> >>>> release cycle? Oh, and do you have a documented step-by-step process
> >>>> for releasing? I'd like to add the PyPI part to this doc and give
> >>>> interested committers rights on the project on PyPI.
> >>>>
> >>>> Max
> >>>>
> >>>> On Wed, Feb 22, 2017 at 2:00 PM, Bolke de Bruin <bdbr...@gmail.com> wrote:
> >>>>
> >>>>> So it is a database integrity issue? Afaik a start_date should always
> >>>>> be set for a DagRun (create_dagrun does so). I didn't check the code,
> >>>>> though.
> >>>>>
> >>>>> Sent from my iPhone
> >>>>>
> >>>>>> On 22 Feb 2017, at 22:19, Dan Davydov <dan.davy...@airbnb.com.INVALID> wrote:
> >>>>>>
> >>>>>> Should clarify this occurs when a dagrun does not have a start date,
> >>>>>> not a dag (which makes it even less likely to happen). I don't think
> >>>>>> this is a blocker for releasing.
> >>>>>>
> >>>>>>> On Wed, Feb 22, 2017 at 1:15 PM, Dan Davydov <dan.davy...@airbnb.com> wrote:
> >>>>>>>
> >>>>>>> I rolled this out in our prod and the webservers failed to load due
> >>>>>>> to this commit:
> >>>>>>>
> >>>>>>> [AIRFLOW-510] Filter Paused Dags, show Last Run & Trigger Dag
> >>>>>>> 7c94d81c390881643f94d5e3d7d6fb351a445b72
> >>>>>>>
> >>>>>>> This fixed it:
> >>>>>>> -                            </a> <span id="statuses_info" class="glyphicon glyphicon-info-sign" aria-hidden="true" title="Start Date: {{last_run.start_date.strftime('%Y-%m-%d %H:%M')}}"></span>
> >>>>>>> +                            </a> <span id="statuses_info" class="glyphicon glyphicon-info-sign" aria-hidden="true"></span>
> >>>>>>>
> >>>>>>> This is caused by assuming that all DAGs have start dates set, so a
> >>>>>>> broken DAG will take down the whole UI. Not sure if we want to make
> >>>>>>> this a blocker for the release or not; I'm guessing that for most
> >>>>>>> deployments this would occur pretty rarely. I'll submit a PR to fix
> >>>>>>> it soon.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Tue, Feb 21, 2017 at 9:49 AM, Chris Riccomini <criccom...@apache.org> wrote:
> >>>>>>>
> >>>>>>>> Ack that the vote has already passed, but belated +1 (binding)
> >>>>>>>>
> >>>>>>>> On Tue, Feb 21, 2017 at 7:42 AM, Bolke de Bruin <bdbr...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> IPMC Voting can be found here:
> >>>>>>>>>
> >>>>>>>>> http://mail-archives.apache.org/mod_mbox/incubator-general/201702.mbox/%3c676bdc9f-1b55-4469-92a7-9ff309ad0...@gmail.com%3e
> >>>>>>>>>
> >>>>>>>>> Kind regards,
> >>>>>>>>> Bolke
> >>>>>>>>>
> >>>>>>>>>> On 21 Feb 2017, at 08:20, Bolke de Bruin <bdbr...@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hello,
> >>>>>>>>>>
> >>>>>>>>>> Apache Airflow (incubating) 1.8.0 (based on RC4) has been
> >>> accepted.
> >>>>>>>>>>
> >>>>>>>>>> 9 “+1” votes received:
> >>>>>>>>>>
> >>>>>>>>>> - Maxime Beauchemin (binding)
> >>>>>>>>>> - Arthur Wiedmer (binding)
> >>>>>>>>>> - Dan Davydov (binding)
> >>>>>>>>>> - Jeremiah Lowin (binding)
> >>>>>>>>>> - Siddharth Anand (binding)
> >>>>>>>>>> - Alex van Boxel (binding)
> >>>>>>>>>> - Bolke de Bruin (binding)
> >>>>>>>>>>
> >>>>>>>>>> - Jayesh Senjaliya (non-binding)
> >>>>>>>>>> - Yi (non-binding)
> >>>>>>>>>>
> >>>>>>>>>> Vote thread (start):
> >>>>>>>>>> http://mail-archives.apache.org/mod_mbox/incubator-airflow-dev/201702.mbox/%3cD360D9BE-C358-42A1-9188-6c92c31a2...@gmail.com%3e
> >>>>>>>>>> <http://mail-archives.apache.org/mod_mbox/incubator-airflow-dev/201702.mbox/%3C7EB7B6D6-092E-48D2-AA0F-15f44376a...@gmail.com%3E>
> >>>>>>>>>>
> >>>>>>>>>> Next steps:
> >>>>>>>>>> 1) I will start the voting process on the IPMC mailing list. I do
> >>>>>>>>>> expect some changes to be required, mostly in documentation and
> >>>>>>>>>> maybe a license here and there. So we might end up with changes to
> >>>>>>>>>> stable. As long as these are not (significant) code changes, I will
> >>>>>>>>>> not re-raise the vote.
> >>>>>>>>>> 2) Only after a positive vote on the IPMC and finalisation will I
> >>>>>>>>>> rebrand the RC to Release.
> >>>>>>>>>> 3) I will upload it to the incubator release page; then the tarball
> >>>>>>>>>> needs to propagate to the mirrors.
> >>>>>>>>>> 4) Update the website (can someone volunteer please?)
> >>>>>>>>>> 5) Finally, I will ask Maxime to upload it to PyPI. It seems we can
> >>>>>>>>>> keep the Apache branding, as libcloud is doing this as well
> >>>>>>>>>> (https://libcloud.apache.org/downloads.html#pypi-package).
> >>>>>>>>>>
> >>>>>>>>>> Jippie!
> >>>>>>>>>>
> >>>>>>>>>> Bolke
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>
> >
>
> --
  _/
_/ Alex Van Boxel
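
On the webserver failure Dan reported earlier in the thread (a template calling `strftime` on a DagRun start date that is not set): one alternative to dropping the tooltip would be to guard against the missing value. The sketch below is only an illustration of that idea; `format_start_date` is a hypothetical helper, not part of Airflow and not the fix that was actually applied.

```python
from datetime import datetime


def format_start_date(start_date, fmt="%Y-%m-%d %H:%M"):
    """Format a DagRun start date, tolerating a missing (None) value."""
    if start_date is None:
        # Hypothetical placeholder text for runs created without a start date.
        return "n/a"
    return start_date.strftime(fmt)


print(format_start_date(datetime(2017, 2, 22, 13, 15)))  # 2017-02-22 13:15
print(format_start_date(None))  # n/a
```

With a guard like this, a single DagRun without a start date would render a placeholder instead of raising while the page is built.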
