Re: [VOTE] Release Airflow 1.8.1 based on Airflow 1.8.1 RC0

2017-04-18 Thread Hitesh Shah
-1.

Not sure if these have been called out earlier.

For all the bundled files with different licenses (MIT, BSD, etc.), the full
texts of these licenses should be in the source tarball, preferably at the
end of the LICENSE file.
webgl-2d needs to be called out as MIT-licensed.
The version in PKG-INFO has an rc0 notation. It should just be
1.8.1-incubating.
A bunch of files under apache_airflow.egg-info/ and scripts/systemd/ need a
license header.
Likewise for airflow/www/templates/airflow/variables/README.md.

Nice to have:
Fix the top-level dir in the tarball to be
"apache-airflow-1.8.1-incubating" instead of
"apache-airflow-1.8.1rc0+apache.incubating"

For all the other binary files (images, GIFs), is there source provenance
for all of them, and are they all covered by the licenses in the
LICENSE file?

Last point: are all the entries in the NOTICE file required, or do they
just need to be in the LICENSE file? Any additions to the NOTICE have
downstream repercussions, as they need to be propagated by any other
project using Airflow.

thanks
-- Hitesh



On Mon, Apr 17, 2017 at 11:24 AM, Chris Riccomini 
wrote:

> Dear All,
>
> I have been able to make the Airflow 1.8.1 RC0 available at:
> https://dist.apache.org/repos/dist/dev/incubator/airflow, public keys are
> available at https://dist.apache.org/repos/dist/release/incubator/airflow.
>
> Issues fixed:
>
> [AIRFLOW-1062] DagRun#find returns wrong result if external_trigg
> [AIRFLOW-1054] Fix broken import on test_dag
> [AIRFLOW-1050] Retries ignored - regression
> [AIRFLOW-1033] TypeError: can't compare datetime.datetime to None
> [AIRFLOW-1030] HttpHook error when creating HttpSensor
> [AIRFLOW-1017] get_task_instance should return None instead of th
> [AIRFLOW-1011] Fix bug in BackfillJob._execute() for SubDAGs
> [AIRFLOW-1001] Landing Time shows "unsupported operand type(s) fo
> [AIRFLOW-1000] Rebrand to Apache Airflow instead of Airflow
> [AIRFLOW-989] Clear Task Regression
> [AIRFLOW-974] airflow.util.file mkdir has a race condition
> [AIRFLOW-906] Update Code icon from lightning bolt to file
> [AIRFLOW-858] Configurable database name for DB operators
> [AIRFLOW-853] ssh_execute_operator.py stdout decode default to A
> [AIRFLOW-832] Fix debug server
> [AIRFLOW-817] Trigger dag fails when using CLI + API
> [AIRFLOW-816] Make sure to pull nvd3 from local resources
> [AIRFLOW-815] Add previous/next execution dates to available def
> [AIRFLOW-813] Fix unterminated unit tests in tests.job (tests/jo
> [AIRFLOW-812] Scheduler job terminates when there is no dag file
> [AIRFLOW-806] UI should properly ignore DAG doc when it is None
> [AIRFLOW-794] Consistent access to DAGS_FOLDER and SQL_ALCHEMY_C
> [AIRFLOW-785] ImportError if cgroupspy is not installed
> [AIRFLOW-784] Cannot install with funcsigs > 1.0.0
> [AIRFLOW-780] The UI no longer shows broken DAGs
> [AIRFLOW-777] dag_is_running is initlialized to True instead of
> [AIRFLOW-719] Skipped operations make DAG finish prematurely
> [AIRFLOW-694] Empty env vars do not overwrite non-empty config v
> [AIRFLOW-139] Executing VACUUM with PostgresOperator
> [AIRFLOW-111] DAG concurrency is not honored
> [AIRFLOW-88] Improve clarity Travis CI reports
>
> I would like to raise a VOTE for releasing 1.8.1 based on release candidate
> 0, i.e. just renaming release candidate 0 to 1.8.1 release.
>
> Please respond to this email by:
>
> +1,0,-1 with *binding* if you are a PMC member or *non-binding* if you are
> not.
>
> Vote will run for 72 hours (ends this Thursday).
>
> Thanks!
> Chris
>
> My VOTE: +1 (binding)
>


Re: 1.8.0 Backfill Clarification

2017-04-18 Thread Maxime Beauchemin
@Chris, this is not the way backfill was originally designed; personally,
I'd flag the behavior you describe as a bug.

To me, backfill should just "fill in the holes", whether the state came
from a previous backfill run, or the scheduler.

`airflow backfill` was originally designed to be used in conjunction with
`airflow clear` when needed; together they should let you perform
whatever "surgery" you may have to do. Clear has a lot of options (from
memory) for date ranges, task_id regex matching, only_failures, ... and so
does backfill. So first you'd issue one or more clear commands to empty out
the false positives and [typically] their descendants, or clear the whole
DAG if you wanted to rerun the whole thing, thus creating the void for
backfill to fill in.
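A concrete sketch of that surgery, using the 1.7/1.8-era CLI flag names; the
DAG id, task regex, and date range below are illustrative placeholders, not
taken from this thread:

```shell
# Clear the false positives (and, typically, their descendants) for a
# date range; example_dag and the regex are placeholders.
airflow clear example_dag -t 'transform_.*' --downstream \
    -s 2017-04-01 -e 2017-04-03 --no_confirm

# Backfill then fills in only the task instances left without a state.
airflow backfill example_dag -s 2017-04-01 -e 2017-04-03
```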

@committers, has that changed?

Max

On Tue, Apr 18, 2017 at 3:53 PM, Paul Zaczkiewicz 
wrote:

> I asked a very similar question last month and got no responses. Note that
> SubDags execute backfill commands in 1.8.0. The original text of that
> question is as follows:
>
> I've recently upgraded to 1.8.0 and immediately encountered the hanging
> SubDag issue that's been mentioned. I'm not sure the rollback from rc5 to
> rc4 fixed the issue.  For now I've removed all SubDags and put their
> task_instances in the main DAG.
>
> Assuming this issue gets fixed, how is one supposed to recover from
> failures within SubDags after the # of retries have maxed out? Previously, I
> would clear the state of the offending tasks and run a backfill job.
> Backfill jobs in 1.7.1 would skip successful task_instances and only run
> the task_instances with cleared states. Now, backfills and SubDagOperators
> clear the state of successful tasks. I'd rather not re-run a task that
> already succeeded. I tried running backfills with --task_regex and
> --ignore_dependencies, but that doesn't quite work either.
>
> If I have t1(success) -> t2(clear) -> t3(clear) and I set --task_regex so
> that it excludes t1, then t2 will run, but t3 will never run because it
> doesn't wait for t2 to finish. It fails because its upstream dependency
> condition is not met.
>
> I like the logical grouping that SubDags provide, but I don't want to
> retry all tasks even if they're successful. I can see why one would want
> that behavior in some cases, but it's certainly not useful in all.
>
> On Tue, Apr 18, 2017 at 6:45 PM, Chris Fei  wrote:
>
> > Hi all,
> >
> >
> >
> > I'm new to Airflow, and I'm looking for someone to clarify the expected
> > behavior of running a backfill with regard to previously successful
> > tasks. When I run a backfill on 1.8.0, tasks that were previously run
> > successfully are re-run for me. Is it expected that backfills re-run all
> > tasks, even those that were marked as successful? For reference, the
> > command I'm running is `airflow backfill -s 2017-04-01 -e 2017-04-03
> > Tutorial`.
> >
> >
> > I wasn't able to find anything in the documentation to indicate either
> > way. Some brief research revealed that invoking backfill was meant
> > at one point to "fill in the blanks", which I interpret to mean "only
> > run tasks that were not completed successfully". On the contrary, the
> > code *does* seem to explicitly set all task instances for a given DAGRun
> > to SCHEDULED (see [AIRFLOW-910][1] and
> > https://github.com/apache/incubator-airflow/pull/2107/files#diff-
> > 54a57ccc2c8e73d12c812798bf79ccb2R1816).
> >
> >
> > Apologies for such a fundamental question, just want to make sure I'm
> > not missing something obvious here. Can someone clarify?
> >
> >
> > Thanks,
> >
> > Chris Fei
> >
> >
> > Links:
> >
> >   1. https://issues.apache.org/jira/browse/AIRFLOW-910
> >
>


Re: Best practices on Long running process over LB

2017-04-18 Thread siddharth anand
Another approach:
1. Airflow calls the webservice in a fire-and-forget fashion
2. The webservice updates a message bus/stream (e.g. SQS) with the result
3. An Airflow sensor pulls updates off SQS and processes them

This saves Airflow from polling your webservice, which would in turn poll
your DB. Additionally, it avoids coupling your Airflow instance to the
availability of your webservice and DB. Otherwise, you'd need to implement
an efficient HTTP endpoint to return status for a potentially long list of
status_ids, and then you'd need to manage that list of ids.

SQS is great.  It's cheap to poll (and SQS supports long-polling as well)
and doesn't couple Airflow to the uptime of your webservice and DB. SQS
also supports batch reads and is transactional.
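A minimal sketch of step 3, under stated assumptions: the message shape (a
JSON body with `status_id` and `state`) is invented, and the SQS client is
injected rather than built with boto3 so the snippet stays self-contained;
in a real DAG this `poke()` would live inside a sensor-operator subclass.

```python
import json

def poke(client, queue_url, expected_id):
    """One sensor poke: True once a 'done' message for expected_id is seen."""
    # Long-poll SQS (cheap, per the point above); `client` is assumed to
    # mirror the boto3 SQS client interface.
    resp = client.receive_message(QueueUrl=queue_url,
                                  MaxNumberOfMessages=10,
                                  WaitTimeSeconds=20)
    for msg in resp.get("Messages", []):
        body = json.loads(msg["Body"])
        # Ack (delete) each message we've consumed; a real sensor would
        # route messages for other status_ids rather than drop them.
        client.delete_message(QueueUrl=queue_url,
                              ReceiptHandle=msg["ReceiptHandle"])
        if body.get("status_id") == expected_id and body.get("state") == "done":
            return True
    return False
```

The batching and transactional reads mentioned above come for free from SQS;
only the routing of unmatched messages needs more care in production.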

-s

On Tue, Apr 18, 2017 at 3:44 PM, Maxime Beauchemin <
maximebeauche...@gmail.com> wrote:

> The proper way to do this is for your service to return a token (unique
> identifier for the long running process) asynchronously (immediately), and
> to then call another endpoint to check on the status while passing this
> token.
>
> Since this is Airflow and you have the luxury of having a lot of predefined
> sensors, you may just have to call a trigger endpoint async, and in the
> next task have a sensor look for the actual byproduct of that service's
> process (say if the process generates an S3 file, you'd have an S3Sensor
> right after the trigger task). The good thing with this approach is that
> this is more "stateless" than the approach where you are using a token (it
> allows for tasks to die without worrying about the token).
>
> Max
>
> On Tue, Apr 18, 2017 at 2:47 PM, Amit Jain  wrote:
>
> > Hi All,
> >
> > We have a use case where we are building an Airflow DAG consisting of a
> > few tasks, and each task (HttpOperator) calls a service running behind
> > an AWS Elastic Load Balancer (ELB).
> >
> > Since these tasks are long-running processes, I'm getting a 504 GATEWAY
> > TIMEOUT HTTP status code, resulting in an incorrect task status on the
> > Airflow side.
> >
> > IMO to solve this problem, we can choose among the following approaches:
> >
> >    - Make a call to the service; the service sends back a response
> >    immediately and processes the actual request in another
> >    thread/process. A monitoring thread heartbeats the task status to
> >    the DB. On the Airflow side, immediately after each HttpOperator, we
> >    have a sensor that checks for the status change at a given poke
> >    interval.
> >    - Since we have around 1500 tasks running per hour, use a
> >    service-discovery system like Apache Zookeeper to pick the node in
> >    round-robin fashion and make a direct connection to the node running
> >    the service.
> >    - AWS ELB limits the HTTP idle timeout to 1 hr, and my tasks take
> >    ~3 hr to complete, so no change on the AWS ELB side is possible.
> >
> > Both approaches have cons; the first one makes us change our current
> > flow on each service side, i.e. handle requests in async mode and start
> > heartbeating the executing process/thread status at some interval,
> > hence the DB writes.
> >
> > I'm interested to know how you are handling this problem, and any
> > suggestions or improvements to the approaches mentioned.
> >
> >
> > Thanks,
> > Amit
> >
>


Re: 1.8.0 Backfill Clarification

2017-04-18 Thread Paul Zaczkiewicz
I asked a very similar question last month and got no responses. Note that
SubDags execute backfill commands in 1.8.0. The original text of that
question is as follows:

I've recently upgraded to 1.8.0 and immediately encountered the hanging
SubDag issue that's been mentioned. I'm not sure the rollback from rc5 to
rc4 fixed the issue.  For now I've removed all SubDags and put their
task_instances in the main DAG.

Assuming this issue gets fixed, how is one supposed to recover from
failures within SubDags after the # of retries have maxed out? Previously, I
would clear the state of the offending tasks and run a backfill job.
Backfill jobs in 1.7.1 would skip successful task_instances and only run
the task_instances with cleared states. Now, backfills and SubDagOperators
clear the state of successful tasks. I'd rather not re-run a task that
already succeeded. I tried running backfills with --task_regex and
--ignore_dependencies, but that doesn't quite work either.

If I have t1(success) -> t2(clear) -> t3(clear) and I set --task_regex so
that it excludes t1, then t2 will run, but t3 will never run because it
doesn't wait for t2 to finish. It fails because its upstream dependency
condition is not met.

I like the logical grouping that SubDags provide, but I don't want to
retry all tasks even if they're successful. I can see why one would want
that behavior in some cases, but it's certainly not useful in all.

On Tue, Apr 18, 2017 at 6:45 PM, Chris Fei  wrote:

> Hi all,
>
>
>
> I'm new to Airflow, and I'm looking for someone to clarify the expected
> behavior of running a backfill with regard to previously successful
> tasks. When I run a backfill on 1.8.0, tasks that were previously run
> successfully are re-run for me. Is it expected that backfills re-run all
> tasks, even those that were marked as successful? For reference, the
> command I'm running is `airflow backfill -s 2017-04-01 -e 2017-04-03
> Tutorial`.
>
>
> I wasn't able to find anything in the documentation to indicate either
> way. Some brief research revealed that invoking backfill was meant
> at one point to "fill in the blanks", which I interpret to mean "only
> run tasks that were not completed successfully". On the contrary, the
> code *does* seem to explicitly set all task instances for a given DAGRun
> to SCHEDULED (see [AIRFLOW-910][1] and
> https://github.com/apache/incubator-airflow/pull/2107/files#diff-
> 54a57ccc2c8e73d12c812798bf79ccb2R1816).
>
>
> Apologies for such a fundamental question, just want to make sure I'm
> not missing something obvious here. Can someone clarify?
>
>
> Thanks,
>
> Chris Fei
>
>
> Links:
>
>   1. https://issues.apache.org/jira/browse/AIRFLOW-910
>


Best practices on Long running process over LB

2017-04-18 Thread Amit Jain
Hi All,

We have a use case where we are building an Airflow DAG consisting of a few
tasks, and each task (HttpOperator) calls a service running behind an AWS
Elastic Load Balancer (ELB).

Since these tasks are long-running processes, I'm getting a 504 GATEWAY
TIMEOUT HTTP status code, resulting in an incorrect task status on the
Airflow side.

IMO to solve this problem, we can choose among the following approaches:

   - Make a call to the service; the service sends back a response
   immediately and processes the actual request in another thread/process.
   A monitoring thread heartbeats the task status to the DB. On the Airflow
   side, immediately after each HttpOperator, we have a sensor that checks
   for the status change at a given poke interval.
   - Since we have around 1500 tasks running per hour, use a
   service-discovery system like Apache Zookeeper to pick the node in
   round-robin fashion and make a direct connection to the node running the
   service.
   - AWS ELB limits the HTTP idle timeout to 1 hr, and my tasks take ~3 hr
   to complete, so no change on the AWS ELB side is possible.

Both approaches have cons; the first one makes us change our current flow
on each service side, i.e. handle requests in async mode and start
heartbeating the executing process/thread status at some interval, hence
the DB writes.

I'm interested to know how you are handling this problem, and any
suggestions or improvements to the approaches mentioned.
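The Airflow-side sensor in the first approach could poke a status endpoint
along these lines; the endpoint path, the JSON shape, and the `fetch`
callable (standing in for an HTTP hook) are all assumptions made to keep the
sketch self-contained:

```python
import json

def status_poke(fetch, task_token):
    """One poke of the hypothetical /status endpoint: True when finished."""
    raw = fetch("/status/%s" % task_token)  # fetch() wraps the HTTP call
    state = json.loads(raw).get("state")    # heartbeat state written by the service
    if state == "failed":
        # Fail the sensor task instead of poking until it times out.
        raise RuntimeError("remote task %s failed" % task_token)
    return state == "finished"
```

Wired into a DAG, a sensor would call this from its poke method at the
configured poke interval, so the HttpOperator itself can return immediately
and the ELB never has to hold a ~3 hr request open.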


Thanks,
Amit


Re: [VOTE] Release Airflow 1.8.1 based on Airflow 1.8.1 RC0

2017-04-18 Thread siddharth anand
https://issues.apache.org/jira/browse/AIRFLOW-1121

Jira filed.

On Tue, Apr 18, 2017 at 1:27 PM, siddharth anand  wrote:

> Sure. As soon as I get out of my meetings.
>
> -s
>
> On Tue, Apr 18, 2017 at 1:01 PM Chris Riccomini 
> wrote:
>
>> @Sid, can you open JIRA(s), and assign them as blockers to 1.8.1?
>>
>> On Tue, Apr 18, 2017 at 12:39 PM, siddharth anand 
>> wrote:
>>
>> > I've run into a regression with the webserver. It looks like the --pid
>> > argument is no longer honored in 1.8.1. The pid file is not being
>> > written out! As a result, monitd, which watches the processes mentioned
>> > in the pid file, keeps trying to spawn webservers.
>> >
>> > HISTTIMEFORMAT="%d/%m/%y %T "
>> > PYTHONPATH=/usr/local/agari/ep-pipeline/production/
>> > current/analysis/cluster/:/usr/local/agari/ep-pipeline/
>> > production/current/analysis/lookups/
>> > TMP=/data/tmp AIRFLOW_HOME=/data/airflow
>> > PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin airflow webserver -p
>> > 8080  --pid /data/airflow/pids/airflow-webserver.pid
>> >
>> > The "upgrade" process for 1.8.1 is not simply "pip install the 1.8.1
>> > tarball". It requires a "pip uninstall" of the previous 1.8.0 version
>> > followed by a new installation of 1.8.1. This could have pulled in some
>> > new dependencies that broke how this works.
>> >
>> > On Tue, Apr 18, 2017 at 12:32 PM, siddharth anand 
>> > wrote:
>> >
>> > > Hmn.. it always worked for me for any of the releases we installed. I
>> > > install `pip install `
>> > >
>> > > -s
>> > >
>> > > On Tue, Apr 18, 2017 at 10:44 AM, Chris Riccomini <
>> criccom...@apache.org
>> > >
>> > > wrote:
>> > >
>> > >> @Sid, how do you enable the versioning? I've never been able to get
>> this
>> > >> to
>> > >> work in my environment. It always shows "Not available", even with
>> > 1.8.0.
>> > >>
>> > >> On Mon, Apr 17, 2017 at 11:18 PM, Bolke de Bruin 
>> > >> wrote:
>> > >>
>> > >> > Hey Alex,
>> > >> >
>> > >> > I agree with you that they are nice to have, but as you mentioned
>> they
>> > >> are
>> > >> > not blockers. As we are moving towards time based releases I
>> suggest
>> > >> > marking them for 1.8.2 and cherry-picking them in your production.
>> > >> >
>> > >> > - Bolke.
>> > >> >
>> > >> > > On 18 Apr 2017, at 00:02, Alex Guziel
>> > > >> D>
>> > >> > wrote:
>> > >> > >
>> > >> > > Sorry about that. FWIW, these were recent and I don't think they
>> > were
>> > >> > > blockers but are nice to fix. Particularly, the tree one was
>> > forgotten
>> > >> > > about. I remember seeing it at the Airflow hackathon but I guess
>> I
>> > >> forgot
>> > >> > > to correct it.
>> > >> > >
>> > >> > > On Mon, Apr 17, 2017 at 12:17 PM, Chris Riccomini <
>> > >> criccom...@apache.org
>> > >> > >
>> > >> > > wrote:
>> > >> > >
>> > >> > >> :(:(:( Why was this not included in 1.8.1 JIRA? I've been
>> emailing
>> > >> the
>> > >> > list
>> > >> > >> all last week
>> > >> > >>
>> > >> > >> On Mon, Apr 17, 2017 at 11:28 AM, Alex Guziel <
>> > >> > >> alex.guz...@airbnb.com.invalid> wrote:
>> > >> > >>
>> > >> > >>> I would say to include [1074] (
>> > >> > >>> https://github.com/apache/incubator-airflow/pull/2221) so we
>> > don't
>> > >> > have
>> > >> > >> a
>> > >> > >>> regression in the release after. I would also say
>> > >> > >>> https://github.com/apache/incubator-airflow/pull/2241 is semi
>> > >> > important
>> > >> > >>> but
>> > >> > >>> less so.
>> > >> > >>>
>> > >> > >>> On Mon, Apr 17, 2017 at 11:24 AM, Chris Riccomini <
>> > >> > criccom...@apache.org
>> > >> > >>>
>> > >> > >>> wrote:
>> > >> > >>>
>> > >> >  Dear All,
>> > >> > 
>> > >> >  I have been able to make the Airflow 1.8.1 RC0 available at:
>> > >> >  https://dist.apache.org/repos/dist/dev/incubator/airflow,
>> public
>> > >> keys
>> > >> > >>> are
>> > >> >  available at https://dist.apache.org/repos/
>> > >> > >>> dist/release/incubator/airflow.
>> > >> > 
>> > >> >  Issues fixed:
>> > >> > 
>> > >> >  [AIRFLOW-1062] DagRun#find returns wrong result if
>> external_trigg
>> > >> >  [AIRFLOW-1054] Fix broken import on test_dag
>> > >> >  [AIRFLOW-1050] Retries ignored - regression
>> > >> >  [AIRFLOW-1033] TypeError: can't compare datetime.datetime to
>> None
>> > >> >  [AIRFLOW-1030] HttpHook error when creating HttpSensor
>> > >> >  [AIRFLOW-1017] get_task_instance should return None instead
>> of th
>> > >> >  [AIRFLOW-1011] Fix bug in BackfillJob._execute() for SubDAGs
>> > >> >  [AIRFLOW-1001] Landing Time shows "unsupported operand
>> type(s) fo
>> > >> >  [AIRFLOW-1000] Rebrand to Apache Airflow instead of Airflow
>> > >> >  [AIRFLOW-989] Clear Task Regression
>> > >> >  [AIRFLOW-974] airflow.util.file mkdir has a race condition
>> > >> >  [AIRFLOW-906] Update Code icon from lightning bolt to file
>> > >> >  

Re: [VOTE] Release Airflow 1.8.1 based on Airflow 1.8.1 RC0

2017-04-18 Thread siddharth anand
Sure. As soon as I get out of my meetings.

-s

On Tue, Apr 18, 2017 at 1:01 PM Chris Riccomini 
wrote:

> @Sid, can you open JIRA(s), and assign them as blockers to 1.8.1?
>
> On Tue, Apr 18, 2017 at 12:39 PM, siddharth anand 
> wrote:
>
> > I've run into a regression with the webserver. It looks like the --pid
> > argument is no longer honored in 1.8.1. The pid file is not being written
> > out! As a result, monitd, which watches the processes mentioned in the
> > pid file, keeps trying to spawn webservers.
> >
> > HISTTIMEFORMAT="%d/%m/%y %T "
> > PYTHONPATH=/usr/local/agari/ep-pipeline/production/
> > current/analysis/cluster/:/usr/local/agari/ep-pipeline/
> > production/current/analysis/lookups/
> > TMP=/data/tmp AIRFLOW_HOME=/data/airflow
> > PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin airflow webserver -p
> > 8080  --pid /data/airflow/pids/airflow-webserver.pid
> >
> > The "upgrade" process for 1.8.1 is not simply "pip install the 1.8.1
> > tarball". It requires a "pip uninstall" of the previous 1.8.0 version
> > followed by a new installation of 1.8.1. This could have pulled in some
> > new dependencies that broke how this works.
> >
> > On Tue, Apr 18, 2017 at 12:32 PM, siddharth anand 
> > wrote:
> >
> > > Hmn.. it always worked for me for any of the releases we installed. I
> > > install `pip install `
> > >
> > > -s
> > >
> > > On Tue, Apr 18, 2017 at 10:44 AM, Chris Riccomini <
> criccom...@apache.org
> > >
> > > wrote:
> > >
> > >> @Sid, how do you enable the versioning? I've never been able to get
> this
> > >> to
> > >> work in my environment. It always shows "Not available", even with
> > 1.8.0.
> > >>
> > >> On Mon, Apr 17, 2017 at 11:18 PM, Bolke de Bruin 
> > >> wrote:
> > >>
> > >> > Hey Alex,
> > >> >
> > >> > I agree with you that they are nice to have, but as you mentioned
> they
> > >> are
> > >> > not blockers. As we are moving towards time based releases I suggest
> > >> > marking them for 1.8.2 and cherry-picking them in your production.
> > >> >
> > >> > - Bolke.
> > >> >
> > >> > > On 18 Apr 2017, at 00:02, Alex Guziel
>  > >> D>
> > >> > wrote:
> > >> > >
> > >> > > Sorry about that. FWIW, these were recent and I don't think they
> > were
> > >> > > blockers but are nice to fix. Particularly, the tree one was
> > forgotten
> > >> > > about. I remember seeing it at the Airflow hackathon but I guess I
> > >> forgot
> > >> > > to correct it.
> > >> > >
> > >> > > On Mon, Apr 17, 2017 at 12:17 PM, Chris Riccomini <
> > >> criccom...@apache.org
> > >> > >
> > >> > > wrote:
> > >> > >
> > >> > >> :(:(:( Why was this not included in 1.8.1 JIRA? I've been
> emailing
> > >> the
> > >> > list
> > >> > >> all last week
> > >> > >>
> > >> > >> On Mon, Apr 17, 2017 at 11:28 AM, Alex Guziel <
> > >> > >> alex.guz...@airbnb.com.invalid> wrote:
> > >> > >>
> > >> > >>> I would say to include [1074] (
> > >> > >>> https://github.com/apache/incubator-airflow/pull/2221) so we
> > don't
> > >> > have
> > >> > >> a
> > >> > >>> regression in the release after. I would also say
> > >> > >>> https://github.com/apache/incubator-airflow/pull/2241 is semi
> > >> > important
> > >> > >>> but
> > >> > >>> less so.
> > >> > >>>
> > >> > >>> On Mon, Apr 17, 2017 at 11:24 AM, Chris Riccomini <
> > >> > criccom...@apache.org
> > >> > >>>
> > >> > >>> wrote:
> > >> > >>>
> > >> >  Dear All,
> > >> > 
> > >> >  I have been able to make the Airflow 1.8.1 RC0 available at:
> > >> >  https://dist.apache.org/repos/dist/dev/incubator/airflow,
> public
> > >> keys
> > >> > >>> are
> > >> >  available at https://dist.apache.org/repos/
> > >> > >>> dist/release/incubator/airflow.
> > >> > 
> > >> >  Issues fixed:
> > >> > 
> > >> >  [AIRFLOW-1062] DagRun#find returns wrong result if
> external_trigg
> > >> >  [AIRFLOW-1054] Fix broken import on test_dag
> > >> >  [AIRFLOW-1050] Retries ignored - regression
> > >> >  [AIRFLOW-1033] TypeError: can't compare datetime.datetime to
> None
> > >> >  [AIRFLOW-1030] HttpHook error when creating HttpSensor
> > >> >  [AIRFLOW-1017] get_task_instance should return None instead of
> th
> > >> >  [AIRFLOW-1011] Fix bug in BackfillJob._execute() for SubDAGs
> > >> >  [AIRFLOW-1001] Landing Time shows "unsupported operand type(s)
> fo
> > >> >  [AIRFLOW-1000] Rebrand to Apache Airflow instead of Airflow
> > >> >  [AIRFLOW-989] Clear Task Regression
> > >> >  [AIRFLOW-974] airflow.util.file mkdir has a race condition
> > >> >  [AIRFLOW-906] Update Code icon from lightning bolt to file
> > >> >  [AIRFLOW-858] Configurable database name for DB operators
> > >> >  [AIRFLOW-853] ssh_execute_operator.py stdout decode default to
> A
> > >> >  [AIRFLOW-832] Fix debug server
> > >> >  [AIRFLOW-817] Trigger dag fails when using CLI + API
> > >> >  [AIRFLOW-816] Make sure 

Re: [VOTE] Release Airflow 1.8.1 based on Airflow 1.8.1 RC0

2017-04-18 Thread siddharth anand
Hmn.. it always worked for me for any of the releases we installed. I
install `pip install `

-s

On Tue, Apr 18, 2017 at 10:44 AM, Chris Riccomini 
wrote:

> @Sid, how do you enable the versioning? I've never been able to get this to
> work in my environment. It always shows "Not available", even with 1.8.0.
>
> On Mon, Apr 17, 2017 at 11:18 PM, Bolke de Bruin 
> wrote:
>
> > Hey Alex,
> >
> > I agree with you that they are nice to have, but as you mentioned they
> are
> > not blockers. As we are moving towards time based releases I suggest
> > marking them for 1.8.2 and cherry-picking them in your production.
> >
> > - Bolke.
> >
> > > On 18 Apr 2017, at 00:02, Alex Guziel 
> > wrote:
> > >
> > > Sorry about that. FWIW, these were recent and I don't think they were
> > > blockers but are nice to fix. Particularly, the tree one was forgotten
> > > about. I remember seeing it at the Airflow hackathon but I guess I
> forgot
> > > to correct it.
> > >
> > > On Mon, Apr 17, 2017 at 12:17 PM, Chris Riccomini <
> criccom...@apache.org
> > >
> > > wrote:
> > >
> > >> :(:(:( Why was this not included in 1.8.1 JIRA? I've been emailing the
> > list
> > >> all last week
> > >>
> > >> On Mon, Apr 17, 2017 at 11:28 AM, Alex Guziel <
> > >> alex.guz...@airbnb.com.invalid> wrote:
> > >>
> > >>> I would say to include [1074] (
> > >>> https://github.com/apache/incubator-airflow/pull/2221) so we don't
> > have
> > >> a
> > >>> regression in the release after. I would also say
> > >>> https://github.com/apache/incubator-airflow/pull/2241 is semi
> > important
> > >>> but
> > >>> less so.
> > >>>
> > >>> On Mon, Apr 17, 2017 at 11:24 AM, Chris Riccomini <
> > criccom...@apache.org
> > >>>
> > >>> wrote:
> > >>>
> >  Dear All,
> > 
> >  I have been able to make the Airflow 1.8.1 RC0 available at:
> >  https://dist.apache.org/repos/dist/dev/incubator/airflow, public
> keys
> > >>> are
> >  available at https://dist.apache.org/repos/
> > >>> dist/release/incubator/airflow.
> > 
> >  Issues fixed:
> > 
> >  [AIRFLOW-1062] DagRun#find returns wrong result if external_trigg
> >  [AIRFLOW-1054] Fix broken import on test_dag
> >  [AIRFLOW-1050] Retries ignored - regression
> >  [AIRFLOW-1033] TypeError: can't compare datetime.datetime to None
> >  [AIRFLOW-1030] HttpHook error when creating HttpSensor
> >  [AIRFLOW-1017] get_task_instance should return None instead of th
> >  [AIRFLOW-1011] Fix bug in BackfillJob._execute() for SubDAGs
> >  [AIRFLOW-1001] Landing Time shows "unsupported operand type(s) fo
> >  [AIRFLOW-1000] Rebrand to Apache Airflow instead of Airflow
> >  [AIRFLOW-989] Clear Task Regression
> >  [AIRFLOW-974] airflow.util.file mkdir has a race condition
> >  [AIRFLOW-906] Update Code icon from lightning bolt to file
> >  [AIRFLOW-858] Configurable database name for DB operators
> >  [AIRFLOW-853] ssh_execute_operator.py stdout decode default to A
> >  [AIRFLOW-832] Fix debug server
> >  [AIRFLOW-817] Trigger dag fails when using CLI + API
> >  [AIRFLOW-816] Make sure to pull nvd3 from local resources
> >  [AIRFLOW-815] Add previous/next execution dates to available def
> >  [AIRFLOW-813] Fix unterminated unit tests in tests.job (tests/jo
> >  [AIRFLOW-812] Scheduler job terminates when there is no dag file
> >  [AIRFLOW-806] UI should properly ignore DAG doc when it is None
> >  [AIRFLOW-794] Consistent access to DAGS_FOLDER and SQL_ALCHEMY_C
> >  [AIRFLOW-785] ImportError if cgroupspy is not installed
> >  [AIRFLOW-784] Cannot install with funcsigs > 1.0.0
> >  [AIRFLOW-780] The UI no longer shows broken DAGs
> >  [AIRFLOW-777] dag_is_running is initlialized to True instead of
> >  [AIRFLOW-719] Skipped operations make DAG finish prematurely
> >  [AIRFLOW-694] Empty env vars do not overwrite non-empty config v
> >  [AIRFLOW-139] Executing VACUUM with PostgresOperator
> >  [AIRFLOW-111] DAG concurrency is not honored
> >  [AIRFLOW-88] Improve clarity Travis CI reports
> > 
> >  I would like to raise a VOTE for releasing 1.8.1 based on release
> > >>> candidate
> >  0, i.e. just renaming release candidate 0 to 1.8.1 release.
> > 
> >  Please respond to this email by:
> > 
> >  +1,0,-1 with *binding* if you are a PMC member or *non-binding* if
> you
> > >>> are
> >  not.
> > 
> >  Vote will run for 72 hours (ends this Thursday).
> > 
> >  Thanks!
> >  Chris
> > 
> >  My VOTE: +1 (binding)
> > 
> > >>>
> > >>
> >
> >
>


Re: [VOTE] Release Airflow 1.8.1 based on Airflow 1.8.1 RC0

2017-04-18 Thread Bolke de Bruin
Hey Alex,

I agree with you that they are nice to have, but, as you mentioned, they are
not blockers. As we are moving towards time-based releases, I suggest marking
them for 1.8.2 and cherry-picking them into your production branch.

- Bolke.

> On 18 Apr 2017, at 00:02, Alex Guziel  wrote:
> 
> Sorry about that. FWIW, these were recent and I don't think they were
> blockers but are nice to fix. Particularly, the tree one was forgotten
> about. I remember seeing it at the Airflow hackathon but I guess I forgot
> to correct it.
> 
> On Mon, Apr 17, 2017 at 12:17 PM, Chris Riccomini 
> wrote:
> 
>> :(:(:( Why was this not included in 1.8.1 JIRA? I've been emailing the list
>> all last week
>> 
>> On Mon, Apr 17, 2017 at 11:28 AM, Alex Guziel <
>> alex.guz...@airbnb.com.invalid> wrote:
>> 
>>> I would say to include [1074] (
>>> https://github.com/apache/incubator-airflow/pull/2221) so we don't have
>> a
>>> regression in the release after. I would also say
>>> https://github.com/apache/incubator-airflow/pull/2241 is semi important
>>> but
>>> less so.
>>> 
>>> On Mon, Apr 17, 2017 at 11:24 AM, Chris Riccomini >> 
>>> wrote:
>>> 
 Dear All,
 
 I have been able to make the Airflow 1.8.1 RC0 available at:
 https://dist.apache.org/repos/dist/dev/incubator/airflow, public keys
>>> are
 available at https://dist.apache.org/repos/
>>> dist/release/incubator/airflow.
 
 Issues fixed:
 
 [AIRFLOW-1062] DagRun#find returns wrong result if external_trigg
 [AIRFLOW-1054] Fix broken import on test_dag
 [AIRFLOW-1050] Retries ignored - regression
 [AIRFLOW-1033] TypeError: can't compare datetime.datetime to None
 [AIRFLOW-1030] HttpHook error when creating HttpSensor
 [AIRFLOW-1017] get_task_instance should return None instead of th
 [AIRFLOW-1011] Fix bug in BackfillJob._execute() for SubDAGs
 [AIRFLOW-1001] Landing Time shows "unsupported operand type(s) fo
 [AIRFLOW-1000] Rebrand to Apache Airflow instead of Airflow
 [AIRFLOW-989] Clear Task Regression
 [AIRFLOW-974] airflow.util.file mkdir has a race condition
 [AIRFLOW-906] Update Code icon from lightning bolt to file
 [AIRFLOW-858] Configurable database name for DB operators
 [AIRFLOW-853] ssh_execute_operator.py stdout decode default to A
 [AIRFLOW-832] Fix debug server
 [AIRFLOW-817] Trigger dag fails when using CLI + API
 [AIRFLOW-816] Make sure to pull nvd3 from local resources
 [AIRFLOW-815] Add previous/next execution dates to available def
 [AIRFLOW-813] Fix unterminated unit tests in tests.job (tests/jo
 [AIRFLOW-812] Scheduler job terminates when there is no dag file
 [AIRFLOW-806] UI should properly ignore DAG doc when it is None
 [AIRFLOW-794] Consistent access to DAGS_FOLDER and SQL_ALCHEMY_C
 [AIRFLOW-785] ImportError if cgroupspy is not installed
 [AIRFLOW-784] Cannot install with funcsigs > 1.0.0
 [AIRFLOW-780] The UI no longer shows broken DAGs
 [AIRFLOW-777] dag_is_running is initlialized to True instead of
 [AIRFLOW-719] Skipped operations make DAG finish prematurely
 [AIRFLOW-694] Empty env vars do not overwrite non-empty config v
 [AIRFLOW-139] Executing VACUUM with PostgresOperator
 [AIRFLOW-111] DAG concurrency is not honored
 [AIRFLOW-88] Improve clarity Travis CI reports
 
 I would like to raise a VOTE for releasing 1.8.1 based on release
>>> candidate
 0, i.e. just renaming release candidate 0 to 1.8.1 release.
 
 Please respond to this email by:
 
 +1,0,-1 with *binding* if you are a PMC member or *non-binding* if you
>>> are
 not.
 
 Vote will run for 72 hours (ends this Thursday).
 
 Thanks!
 Chris
 
 My VOTE: +1 (binding)
 
>>> 
>>