Re: [VOTE] Graduate the Apache Airflow as a TLP

2018-11-30 Thread Stefan Seelmann
+1 (non-binding) On 11/30/18 10:33 PM, Jakob Homan wrote: > Hey all! > > Following a very successful DISCUSS[1] regarding graduating Airflow to > Top Level Project (TLP) status, I'm starting the official VOTE. > > Since entering the Incubator in 2016, the community has: >* successfully produ

Re: [DISCUSS] Apache Airflow graduation from the incubator

2018-11-26 Thread Stefan Seelmann
I agree that Apache Airflow should graduate. I've only been involved since the beginning of this year, but the project did two releases during that time; once it is a TLP, releasing becomes easier :) Regarding QU30, you may consider using the ASF-wide security mailing list [3] and process [4]. Kind Regards, Stefan

Re: It's very hard to become a committer on the project

2018-09-22 Thread Stefan Seelmann
On 9/20/18 10:02 PM, Driesprong, Fokko wrote: > us still have a full time job on the side :) Tomorrow I'll spend time to > clean up the old Jira's. My cold prevents me from doing creative things, so I went through Jira and picked the low-hanging fruit. The following issues can be closed IMHO: Spa

Re: Duplicate key unique constraint error

2018-09-17 Thread Stefan Seelmann
On 9/17/18 8:19 PM, Abhishek Sinha wrote: > Any update on this? > >> Please find the scheduler error log attached. >> >> Can you share the full python stack trace? Seems the mailing list doesn't allow attachments. Either post the stacktrace inline, or post it somewhere at pastebin or so.

Database referential integrity

2018-09-17 Thread Stefan Seelmann
Hi, looking into the DB schema, there is almost no referential integrity enforced at the database level. Many foreign key constraints between dag, dag_run, task_instance, xcom, dag_pickle, log, etc. would make sense IMO. Is there a particular reason why that's not implemented? Introducing it now will
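To illustrate what enforcing such a constraint at the database level means, here is a minimal sketch using SQLite from the Python standard library. The table names dag and dag_run come from the message above; the column layout is a simplified assumption and not the actual Airflow schema, and insert_orphan_run is a hypothetical helper.

```python
import sqlite3

# In-memory database; SQLite only enforces foreign keys when the pragma is on.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Simplified, assumed columns (not the real Airflow schema).
conn.execute("CREATE TABLE dag (dag_id TEXT PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE dag_run (
        id INTEGER PRIMARY KEY,
        dag_id TEXT NOT NULL REFERENCES dag (dag_id)
    )
    """
)

conn.execute("INSERT INTO dag (dag_id) VALUES ('example')")
conn.execute("INSERT INTO dag_run (dag_id) VALUES ('example')")


def insert_orphan_run(connection):
    """Try to insert a dag_run whose dag does not exist (hypothetical helper)."""
    try:
        connection.execute("INSERT INTO dag_run (dag_id) VALUES ('missing')")
    except sqlite3.IntegrityError:
        # The database itself rejects the orphaned row.
        return False
    return True
```

With the constraint in place, the database rejects the orphaned row itself instead of relying on application code to keep the tables consistent.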

Re: Plugin Support for RBAC GUI

2018-09-15 Thread Stefan Seelmann
On 9/14/18 10:45 PM, Ian Davison wrote: > My team has been working on adding more plugin views for Airflow’s GUI. With > the release of 1.10 we’ve been looking into the new Flask AppBuilder GUI for > RBAC. I see now that the RBAC GUI doesn’t support user defined views via the > plugin in manager

Re: ExternalTaskSensor alternatives

2018-08-22 Thread Stefan Seelmann
On 08/22/2018 06:56 PM, Tao Feng wrote: > FYI, there is an existing pr and proposal for improving sensor efficiency( > https://issues.apache.org/jira/browse/AIRFLOW-2747 and > https://github.com/apache/incubator-airflow/pull/3596/files) by the > community. And I hope I'll find some time next week

Re: Airflow talk at Munich Data Engineering Meetup

2018-07-24 Thread Stefan Seelmann
On 07/24/2018 08:55 AM, Bolke de Bruin wrote: > Great stuff! Can you make sure to at least once use “Apache Airflow > (incubating)” in the text, preferably at the beginning of the paragraph? That > would be greatly appreciated. Sure, done, also in the slides. Kind Regards, Stefan

Airflow talk at Munich Data Engineering Meetup

2018-07-23 Thread Stefan Seelmann
Hi all, I'll give a talk about Airflow at the next Data Engineering Meetup in Munich (Germany) next Thursday, the 26th. Maybe some folks from the Munich area are interested. Details at [1]. Kind Regards, Stefan [1] https://www.meetup.com/data-engineering-munich/events/252170998/

[Proposal] Explicit re-schedule of sensors

2018-07-12 Thread Stefan Seelmann
Hi all, I'd like to discuss a proposal to enable explicit re-scheduling of sensors. I think there is demand for such a thing; in the last few weeks multiple people asked for it or mentioned workarounds. I created a Jira [1] that describes the proposal and an initial PR [2]. Feedback welcome :-) Ki

Re: Using large numbers of sensors, resource consumption

2018-07-10 Thread Stefan Seelmann
I also have that requirement and I'm working on a proposal for rescheduling tasks. My current PoC can be found at [1]; it uses the up_for_retry state, which has some problems. I started to make some changes and hope I can make a first proposal this week. The basic idea is: * A new "reschedule" flag for

Re: [VOTE] Airflow 1.10.0rc1

2018-07-10 Thread Stefan Seelmann
+1 (non-binding) * Verified checksums and signatures of the packages * Checked license and notice files * Ran the tests * Installed from git tag and ran some example DAGs Two minor findings: * In airflow/api/auth/backend/kerberos_auth.py the original license header was replaced with the ASF one,

Re: Apache Airflow 1.10.0b2

2018-06-25 Thread Stefan Seelmann
I noticed one bug, reported in https://issues.apache.org/jira/browse/AIRFLOW-2639. A patch is available; it would be nice if that could go into the release. Kind Regards, Stefan On 06/25/2018 10:14 PM, Bolke de Bruin wrote: > Thanks for the responses. I will check if this warrants a beta 3 or if we ar

Avoid sensor sleep by rescheduling task, was: Re: How to wait for external process

2018-06-02 Thread Stefan Seelmann
I dug a bit into the Airflow code and I think I found a possible solution, see draft at [1]: Add a "reschedule" flag to BaseSensorOperator; when set, the sensor doesn't sleep but raises an AirflowRescheduleTask exception. Within the TaskInstance this exception is handled, similar to a failure. The task s
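The idea in this message can be sketched in plain Python without importing Airflow. This is only an illustration under stated assumptions, not the draft at [1]: RescheduleTask stands in for the AirflowRescheduleTask exception named above, and SketchSensor and run_once are hypothetical names for the sensor and for the handling inside TaskInstance.

```python
import time


class RescheduleTask(Exception):
    """Hypothetical stand-in for the AirflowRescheduleTask exception."""


class SketchSensor:
    """Hypothetical sensor with a "reschedule" flag, as described above."""

    def __init__(self, poke, poke_interval=0.0, reschedule=False):
        self.poke = poke                    # callable returning True when done
        self.poke_interval = poke_interval  # seconds to sleep between pokes
        self.reschedule = reschedule        # give up the worker instead of sleeping

    def execute(self):
        while not self.poke():
            if self.reschedule:
                # Do not sleep; signal that the task should be run again later.
                raise RescheduleTask()
            time.sleep(self.poke_interval)
        return "success"


def run_once(sensor):
    """Hypothetical stand-in for the exception handling in TaskInstance."""
    try:
        return sensor.execute()
    except RescheduleTask:
        # Handled similarly to a failure: the attempt ends and the worker
        # slot is released until the next attempt.
        return "rescheduled"
```

Each call to run_once represents one attempt: with the flag set, an unsuccessful poke ends the attempt immediately rather than blocking a worker in sleep.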

Re: How to wait for external process

2018-05-28 Thread Stefan Seelmann
; now, I can think of, is updating the state directly in the database. >> But then you need to know what you are doing. I can imagine that this would >> be feasible by using an AWS lambda function. Hope this helps. >> >> Cheers, Fokko >> >> 2018-05-26 17:50 GMT+0

How to wait for external process

2018-05-26 Thread Stefan Seelmann
Hello, I have a DAG (externally triggered) where some processing is done on an external system (EC2 instance). The processing is started by an Airflow task (via HTTP request). The DAG should only continue once that processing is completed. In a first naive implementation I created a sensor that ge