Trigger DagRun Task based on notification

2019-01-31 Thread ramandumcs
Hi All, In our workflows we trigger big data jobs which run from a few hours to a few days. Currently our Airflow operator submits the job and keeps polling its status. Depending on its status, the next task in the workflow is triggered by the Airflow scheduler. So currently operator is not doing any u
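
The submit-then-poll behaviour described in this message can be sketched roughly as follows. This is a minimal illustration in plain Python, not Airflow code: `wait_for_job`, `get_status`, and the `TERMINAL_STATES` values are hypothetical names chosen for the example, and the status strings are assumptions rather than anything stated in the thread.

```python
import time
from typing import Callable

# Assumed terminal states for the external job (hypothetical values).
TERMINAL_STATES = {"SUCCEEDED", "FAILED"}


def wait_for_job(
    get_status: Callable[[], str],
    poll_interval: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Block until the job reaches a terminal state, polling at a fixed interval."""
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        # The worker slot stays occupied while we wait; this is the cost
        # a notification-based trigger would avoid.
        sleep(poll_interval)
```

The `sleep` parameter is injectable only so the loop can be exercised without real waiting.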

Re: Proposal to remove json_client

2019-01-31 Thread Bolke de Bruin
Hi David, I assume that you then want to ensure that all calls from the CLI are made through the REST API. In other words, the json_client would be the only client remaining. The local_client is a security problem as it needs direct database access, which makes it problematic to have normal

Re: AIP-12 Persist DAG into DB

2019-01-31 Thread Maxime Beauchemin
Right, it's been discussed extensively in the past and the main thing needed to get to a "stateless web server" (or at least a DagBag-free web server) is to drop the template rendering in the UI. Also we might need little workarounds (we'd have to dig in to check) around deleting task instances or

Re: Custom scheduler support in Airflow

2019-01-31 Thread Brian Greene
I’d agree with Ash as well - and the externally triggered DAG model works well and still allows you to use Airflow for “normal” scheduled tasks. Admittedly we struggled with this for a while, working really hard to use schedules and xcom etc. This is really “state management” in my opinion, an

Re: Proposal to remove json_client

2019-01-31 Thread David Cavaletto
In case my images didn't come through, here is a link that shows what I'd like to remove (in red) and add (in green) https://docs.google.com/drawings/d/1Ux8qGQUdRp2L6YWgqayliIzFiJv1nwev4JaG2rR7oiQ/edit?usp=sharing On Thu, Jan 31, 2019 at 8:21 PM David Cavaletto wrote: > Hello, > > While trying

Proposal to remove json_client

2019-01-31 Thread David Cavaletto
Hello, While trying to refactor the Connection API (both REST and CLI) I've discovered the CLI has the ability to make REST API calls. The flow looks like this: [image: Apache Airflow API flow.jpg] In a discussion with Ash in Slack, he said this architecture is so that a user can use a local inst

Re: Usage of airflow/contrib/auth/backends/password_auth.py

2019-01-31 Thread Bolke de Bruin
Kerberos auth also works with the API. So let's properly remove it. > On 31 Jan 2019, at 14:52, Deng Xiaodong wrote: > > Thanks Niels. > > In this case, one solution we’re thinking of is to refactor this module to > make it work with the `ab_user` table (currently it’s wo

Re: Custom scheduler support in Airflow

2019-01-31 Thread abhishek sharma
> Have you looked into TriggerDagRunOperator [@Tao], I am aware of it, and in my opinion, it's same if you use an operator or invoking trigger from the command line. @Ash, I agree that this is not an easy task and it requires a lot of effort. As I said previously; *first we have to decide whether

Re: Custom scheduler support in Airflow

2019-01-31 Thread Tao Feng
If I understand your request correctly, I don't think you need a custom scheduler, but rather a custom way to create a DAG run that is not time-based? Have you looked into TriggerDagRunOperator ( https://github.com/apache/airflow/blob/master/airflow/operators/dagrun_operator.py#L36 )? On Thu, Jan 31,

Re: Custom scheduler support in Airflow

2019-01-31 Thread Ash Berlin-Taylor
> wouldn't it be easy if we have some custom scheduler support in Airflow Don't underestimate JUST how much work this would actually involve. Right now given the solutions presented, and the ability to trigger DAGs in Airflow via the existing API I am not convinced that Airflow needs the added

Re: Custom scheduler support in Airflow

2019-01-31 Thread abhishek sharma
Thanks, Brian & Ben. So, you guys also have such workflows, and through Sensors or running DAGs frequently things are working out for you guys. In my case, I am running an application which works as a 'custom scheduler' and triggers DAGs based on event occurrence. Question to you guys, wouldn't i

Re: Custom scheduler support in Airflow

2019-01-31 Thread Ben Tallman
To solve that exact problem, we ran a DAG on a frequent schedule that basically acted as a scheduler. It used a shell script to kick off other DAGs. Possibly a custom scheduler would be a more elegant solution. Thanks, Ben -- Ben Tallman - 503.680.5709 On Thu, Jan 31, 2019 at 11:03 AM abhishek

Re: Custom scheduler support in Airflow

2019-01-31 Thread Brian Greene
This is actually the majority of our Airflow work. We use this pattern: a sensor pings an API (fairly quickly, in a DAG that’s constrained to only run one instance, every minute). If the sensor gets a valid response, the next task is a custom operator that extends Trigger, builds up the DagRun contex
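
The sensor-then-trigger pattern in this message can be sketched in plain Python as below. This is only an illustration under stated assumptions, not the poster's code and not the Airflow API: `poll_and_trigger`, `fetch_event`, `trigger_dag`, and the shape of the `conf` dictionary are all hypothetical. In Airflow itself the two halves would be a sensor task and a trigger operator.

```python
from typing import Any, Callable, Dict, Optional


def poll_and_trigger(
    fetch_event: Callable[[], Optional[Dict[str, Any]]],
    trigger_dag: Callable[[str, Dict[str, Any]], None],
    target_dag_id: str,
) -> bool:
    """Run one poll: trigger the target DAG if the API returned an event.

    Returns True when a run was triggered, False when there was nothing to do.
    """
    event = fetch_event()  # stands in for the sensor pinging the API
    if event is None:
        return False
    # Build the payload handed to the triggered run (assumed shape).
    conf = {"event": event}
    trigger_dag(target_dag_id, conf)
    return True
```

Passing the fetch and trigger steps in as callables keeps the sketch independent of any particular API client or Airflow version.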

Re: Custom scheduler support in Airflow

2019-01-31 Thread abhishek sharma
Hi Ben, Just copying my comment from the ticket. I think the current Airflow scheduler schedules DAGs only on a time basis (based on a cron schedule string). *Is this understanding correct?* How to approach a scenario where I want to trigger a DAG based on some event which is not so predictable/regular on ti

Re: AIP-12 Persist DAG into DB

2019-01-31 Thread Dan Davydov
Agreed on complexities (I think deprecating Jinja templates for webserver rendering is one thing), but I'm not sure I understand on the falling down on code changes part, mind providing an example? On Thu, Jan 31, 2019 at 12:22 PM Ash Berlin-Taylor wrote: > That sounds like a good idea at first,

Re: AIP-12 Persist DAG into DB

2019-01-31 Thread Ash Berlin-Taylor
That sounds like a good idea at first, but falls down with possible code changes in operators between one task and the next. (I would like this, but there are definite complexities) -ash On 31 January 2019 16:56:54 GMT, Dan Davydov wrote: >I feel the right higher-level solution to this probl

Re: Custom scheduler support in Airflow

2019-01-31 Thread Ben Tallman
Can you explain a bit more what you are thinking for a custom scheduler? It's been awhile, but we added support for cron schedules without backfill awhile back, so I'm wondering what you are thinking of adding with this? Thanks, Ben -- Ben Tallman - 503.680.5709 On Thu, Jan 31, 2019 at 8:29 AM

Re: AIP-12 Persist DAG into DB

2019-01-31 Thread Dan Davydov
I feel the right higher-level solution to this problem (which is "Adding Consistency to Airflow") is DAG serialization, that is, all DAGs should be represented as e.g. JSON (similar to the current SimpleDAGBag object used by the Scheduler). This solves the webserver issue, and also adds consistency
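
As a rough illustration of what representing a DAG as JSON could look like, here is a minimal sketch using only the standard library. The field names (`dag_id`, `schedule`, `tasks`, `upstream`) and both helper functions are assumptions made for this example; they are not the format proposed in the thread or in AIP-12.

```python
import json
from typing import Dict, List


def serialize_dag(dag_id: str, schedule: str, deps: Dict[str, List[str]]) -> str:
    """Serialize a DAG, given as task_id -> upstream task_ids, to a JSON string."""
    payload = {
        "dag_id": dag_id,
        "schedule": schedule,
        "tasks": [
            {"task_id": task_id, "upstream": sorted(upstream)}
            for task_id, upstream in sorted(deps.items())
        ],
    }
    # sort_keys gives a stable string, so the same DAG always serializes identically.
    return json.dumps(payload, sort_keys=True)


def deserialize_dag(text: str) -> Dict[str, List[str]]:
    """Rebuild the task_id -> upstream mapping without importing any DAG code."""
    payload = json.loads(text)
    return {task["task_id"]: task["upstream"] for task in payload["tasks"]}
```

A consumer such as a web server could read this structure from a database instead of parsing the DAG files itself, which is the consistency benefit the message points to.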

Custom scheduler support in Airflow

2019-01-31 Thread abhishek sharma
Hi All, Created a ticket (https://issues.apache.org/jira/browse/AIRFLOW-3775) for supporting a custom scheduler in Airflow. The idea is to have a scheduler base class which can be extended for writing a custom scheduler. The logic of custom scheduling is user specific, and at the DAGs task level we

Re: AIP-12 Persist DAG into DB

2019-01-31 Thread airflowuser
I guess it's the same as: https://issues.apache.org/jira/browse/AIRFLOW-2619 However there is another related (?) issue: https://issues.apache.org/jira/browse/AIRFLOW-2637 When updating some of the DAG properties the DAG is flaky for a while until it stabilizes. Possibly this is also related: https

AIP-12 Persist DAG into DB

2019-01-31 Thread Peter van ‘t Hof
Hi All, As most of you guys know, Airflow has an issue when loading new DAGs where the webserver sometimes sees them and sometimes does not. Because of this we wrote this AIP to solve the issue: https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-12+Persist+DAG+into+DB Any feedback is welcome.

Re: Usage of airflow/contrib/auth/backends/password_auth.py

2019-01-31 Thread Deng Xiaodong
Thanks Niels. In this case, one solution we’re thinking of is to refactor this module to make it work with the `ab_user` table (currently it’s working based on the `user` table. It may not make much sense to maintain both `user` and `ab_user` table, at least to me personally). XD > On 31 Jan

Re: Usage of airflow/contrib/auth/backends/password_auth.py

2019-01-31 Thread Deng Xiaodong
No, please do voice out if you have any concern or suggestion ;-) XD > On 31 Jan 2019, at 9:45 PM, Shah Altaf wrote: > > My mistake, yes it's just for UI. Not for API. I'll be quiet now :-) > > > > On Thu, Jan 31, 2019 at 1:42 PM Deng Xiaodong wrote: > >> Hi Shah, >> >> Thanks for your

Re: Usage of airflow/contrib/auth/backends/password_auth.py

2019-01-31 Thread Niels Zeilemaker
We are using it to secure the API, to allow external processes to trigger DAGs. The Airflow instances have a public IP, and hence we needed to secure the API in this manner. Niels On Thu, 31 Jan 2019 at 14:45, Shah Altaf wrote: My mistake, yes it's just for UI. Not for API. I'll be quiet now :-) >

Re: Usage of airflow/contrib/auth/backends/password_auth.py

2019-01-31 Thread Shah Altaf
My mistake, yes it's just for UI. Not for API. I'll be quiet now :-) On Thu, Jan 31, 2019 at 1:42 PM Deng Xiaodong wrote: > Hi Shah, > > Thanks for your reply. > > May I confirm that you mean that you’re using it for the UI or for the API > authentication? What we discuss here is only the AP

Re: Usage of airflow/contrib/auth/backends/password_auth.py

2019-01-31 Thread Deng Xiaodong
Hi Shah, Thanks for your reply. May I confirm whether you’re using it for UI or for API authentication? What we are discussing here is only the API authentication. If you meant UI authentication, actually the non-RBAC UI is already deprecated in master branch, as I shared (meaning

Re: Usage of airflow/contrib/auth/backends/password_auth.py

2019-01-31 Thread Shah Altaf
Hello, yes we are using password_auth for all of our Airflow installations. We specifically use it to automate creation of many users. On Thu, Jan 31, 2019 at 1:08 PM Deng Xiaodong wrote: > Hi folks, > > As you may have noticed, the Flask-Admin based UI (non-RBAC) was already > deprecated in

Usage of airflow/contrib/auth/backends/password_auth.py

2019-01-31 Thread Deng Xiaodong
Hi folks, As you may have noticed, the Flask-Admin based UI (non-RBAC) was already deprecated in the master branch. Some work is going on to further clean the codebase. In the process, we found that there are some “legacy” modules in “airflow/contrib/auth/backends/

Re: Draft of AIP - 11 Adding a new landing page

2019-01-31 Thread Kamil Breguła
Hello, Currently, Airflow has already chosen the hosting server. My guess is that this is an infrastructure managed by Apache. Airflow also has a repository created for the website. https://github.com/apache/airflow-site Thanks in advance Kamil Bregula On Thu, Jan 31, 2019 at 12:09 AM Aizhamal N