Airflow Multi-Tenancy to filter dags by team not users

2018-08-09 Thread ayush . chauhan
Hi, I am trying to implement Airflow multi-tenancy and Google authentication in Airflow. I am using Airflow 1.9.0. I have the following questions about implementing it: 1) How to filter DAGs by team instead of individual users. I know a workaround for this will be creating a user for each tea

Re: [VOTE] Airflow 1.10.0rc4

2018-08-09 Thread Naik Kaxil
+1 (binding) Tested it on Python2.7 with flask UI On 09/08/2018, 23:58, "Daniel Imberman" wrote: -1 (non-binding), There is a k8s bug fix that should be PRed by @jordan.zucker this weekend (relating to the tracking of resourceVersions). There have also been multiple users

Re: [VOTE] Airflow 1.10.0rc4

2018-08-09 Thread Daniel Imberman
-1 (non-binding), There is a k8s bug fix that should be PRed by @jordan.zucker this weekend (relating to the tracking of resourceVersions). There have also been multiple users requesting the ability to pre-bake docker images which I will make a PR for this weekend. On Thu, Aug 9, 2018 at 11:22 AM

Podling Report Reminder - August 2018

2018-08-09 Thread jmclean
Dear podling, This email was sent by an automated system on behalf of the Apache Incubator PMC. It is an initial reminder to give you plenty of time to prepare your quarterly board report. The board meeting is scheduled for Wed, 15 August 2018, 10:30 am PDT. The report for your podling will form

Re: apache-airflow v1.10.0 on PyPi?

2018-08-09 Thread Krish Sigler
Got it, will use the mailing list in the future. Thanks for the info. On Thu, Aug 9, 2018 at 2:42 PM, Bolke de Bruin wrote: > Hi Kris, > > Please use the mailing list for these kinds of questions. > > Airflow 1.10.0 hasn’t been released yet. We are going through the motions, > but it will take a

Re: apache-airflow v1.10.0 on PyPi?

2018-08-09 Thread Bolke de Bruin
Hi Kris, Please use the mailing list for these kinds of questions. Airflow 1.10.0 hasn’t been released yet. We are going through the motions, but it will take a couple of days before it’s official (if all goes well). B. Sent from my iPad > On 9 Aug 2018 at 23:33, Krish Sigler

Re: Broken DAG message won't go away in webserver

2018-08-09 Thread Alex Guziel
IIRC the scheduler sets these messages in the error table in the db. On Thu, Aug 9, 2018 at 2:13 PM, Ben Laird wrote: > The messages persist even after restarting the webserver. I've verified > with other airflow users in the office that they'd have to manually delete > records from the 'import_

Re: Broken DAG message won't go away in webserver

2018-08-09 Thread Ben Laird
The messages persist even after restarting the webserver. I've verified with other airflow users in the office that they'd have to manually delete records from the 'import_error' table. When you say 'sync your DAGs', what do you mean exactly? When we fix a DAG, we'd normally kill the webserver pro
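The manual cleanup mentioned in this message can be sketched as follows. This is only an illustration run against an in-memory SQLite database: the `import_error` table name comes from the thread, but the simplified columns, the file paths, and the stack traces are assumptions, and a real Airflow metadata database would be reached through its own connection.

```python
import sqlite3

# In-memory stand-in for the metadata database (assumption: simplified columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE import_error ("
    "id INTEGER PRIMARY KEY, filename TEXT, stacktrace TEXT)"
)

# Hypothetical stale records left behind by two broken DAG files.
conn.executemany(
    "INSERT INTO import_error (filename, stacktrace) VALUES (?, ?)",
    [
        ("/dags/fixed_dag.py", "SyntaxError: invalid syntax"),
        ("/dags/still_broken_dag.py", "ImportError: No module named foo"),
    ],
)

# Remove only the records that belong to the DAG file that has been fixed.
fixed_file = "/dags/fixed_dag.py"  # hypothetical path
deleted = conn.execute(
    "DELETE FROM import_error WHERE filename = ?", (fixed_file,)
).rowcount
conn.commit()

remaining = [row[0] for row in conn.execute(
    "SELECT filename FROM import_error ORDER BY id"
)]
print(deleted, remaining)
```

Filtering on the filename keeps the records for DAG files that are still broken, so their warnings remain visible.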

Re: Broken DAG message won't go away in webserver

2018-08-09 Thread Taylor Edmiston
Yeah, you definitely shouldn't need to do a resetdb for that. Did you try restarting the webserver? How do you sync your DAGs to the webserver? Is it possible the fixed DAG didn't get synced there? For me, IIRC, the error stops persisting once the DAG is fixed and synced. *Taylor Edmiston* Blo

Re: Plan to change type of dag_id from String to Number?

2018-08-09 Thread Maxime Beauchemin
The change on perf for the DAG table would be extremely negligible. Maybe for task_instances (large table with millions of rows, 3 fields composite key) it *could* be a decent idea. Though you'd then need to have two indexes to store and maintain and we may have to change the code to actually use

Broken DAG message won't go away in webserver

2018-08-09 Thread Ben Laird
Hello - I've noticed this several times and am not sure what the solution is. If I have a DAG error at some point, I'll see a message in the webserver that says "Broken DAG: [Error]". However, after fixing the code, restarting the webserver, etc., the error persists. After closing it out, it will just p

Re: [VOTE] Airflow 1.10.0rc4

2018-08-09 Thread Bolke de Bruin
0.5?? Can we score fractions :-) ? Sorry I missed this Ash. I think Fokko really wants a 1.10.1 quickly so better include it then? Can you make your vote +1? Thx Bolke > On 9 Aug 2018, at 14:06, Ash Berlin-Taylor wrote: > > +0.5 (binding) from me. > > Tested upgrading from 1.9.0 metadb on Py

Re: Modeling rate limited api calls in airflow

2018-08-09 Thread Gerard Toonstra
Have you looked into pools? Pools allow you to specify how many tasks at any given time should use a common resource. That way you could limit this to 1, 2, or 3 for example. Pools are not dynamic however, so they only let you set an upper limit on how many clients are going to hit the API at
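The slot-limiting behaviour described in this reply can be illustrated with a minimal standard-library sketch. It does not use Airflow itself: a bounded semaphore stands in for a pool with a fixed number of slots, and the pool size, the task function, and the sleep that replaces the API call are all assumptions made for illustration. In Airflow, a task is assigned to a pool through the operator's `pool` argument.

```python
import threading
import time

POOL_SLOTS = 2  # hypothetical pool size
pool = threading.BoundedSemaphore(POOL_SLOTS)

lock = threading.Lock()
running = 0  # tasks currently holding a slot
peak = 0     # highest number of tasks seen holding a slot at once


def call_api(task_id):
    """Stand-in for a task that hits the rate-limited API."""
    global running, peak
    with pool:  # take one slot; blocks while all slots are in use
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.05)  # placeholder for the real API request
        with lock:
            running -= 1


# Six tasks compete for two slots.
threads = [threading.Thread(target=call_api, args=(i,)) for i in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("peak concurrent tasks:", peak)
```

As with a pool, the limit is fixed up front: it caps how many tasks run at once but does not react to a rate limit that changes over time.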

Re: [VOTE] Airflow 1.10.0rc4

2018-08-09 Thread Ash Berlin-Taylor
+0.5 (binding) from me. Tested upgrading from 1.9.0 metadb on Py3.5. Timezones behaving themselves on Postgres. Have not tested the Rbac-based UI. https://github.com/apache/incubator-airflow/commit/d9fecba14c5eb56990508573a91b13ab27ca5153

Re: Plan to change type of dag_id from String to Number?

2018-08-09 Thread Vardan Gupta
Point well taken on backward compatibility, we will have to take this change very diligently, if implemented. On Thu, Aug 9, 2018 at 7:29 PM Юли Волкова wrote: > Because what you described says nothing about backward compatibility. > Do you think everyone who uses it needs to change all their DAGs?

Re: Plan to change type of dag_id from String to Number?

2018-08-09 Thread Vardan Gupta
Absolutely, I'll work on producing some results. Also, it's not just a matter of joining table, even pointed queries on individual tables like task_instance, dag_run, fag_failure will be faster with integer identifier. On Thu, Aug 9, 2018 at 7:59 PM Ash Berlin-Taylor wrote: > Since this is a big

Re: Plan to change type of dag_id from String to Number?

2018-08-09 Thread Ash Berlin-Taylor
Since this is a big change that would touch much of the code base, before we do this we need to see some hard numbers - timing or benchmarks of queries etc. Also how often do we actually do such a join etc? -ash > On 9 Aug 2018, at 13:04, vardangupta...@gmail.com >

Re: Plan to change type of dag_id from String to Number?

2018-08-09 Thread Юли Волкова
Because what you described says nothing about backward compatibility. Do you think everyone who uses it needs to change all their DAGs? It's very strange, because you propose one of the most critical changes and it will affect everyone. If you want an id - call it dag_metadata_id and add it. But not propose

Re: [VOTE] Airflow 1.10.0rc4

2018-08-09 Thread Driesprong, Fokko
Good point Bolke, Sid, seems that there are still a few issues with Tenacity as well, therefore I would like to change my vote: +1 (binding) Cheers, Fokko 2018-08-09 14:08 GMT+02:00 Ash Berlin-Taylor : > +0.5 (binding) from me. > > Tested

Re: Plan to change type of dag_id from String to Number?

2018-08-09 Thread vardanguptacse
On 2018/08/09 11:55:11, Ash Berlin-Taylor wrote: > Absolutely - there will still need to be a human-readable DAG id, even if we end > up with an auto-incrementing integer ID column internally for table join > performance reasons. > > -ash > > > On 9 Aug 2018, at 12:35, Юли Волкова wrote:

Re: Plan to change type of dag_id from String to Number?

2018-08-09 Thread Ash Berlin-Taylor
Absolutely - there will still need to be a human-readable DAG id, even if we end up with an auto-incrementing integer ID column internally for table join performance reasons. -ash > On 9 Aug 2018, at 12:35, Юли Волкова wrote: > > How will you understand what your DAG 2 doing enter to it?
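The idea in this reply, a readable DAG id kept alongside an internal integer key, can be sketched with SQLite as below. The table layout, column names, and sample DAG name are hypothetical and do not reflect Airflow's actual schema. The sketch only shows that joins can run on the integer key while users keep seeing the readable name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dag (
    id     INTEGER PRIMARY KEY AUTOINCREMENT,  -- internal surrogate key
    dag_id TEXT NOT NULL UNIQUE                -- human-readable identifier
);
CREATE TABLE task_instance (
    dag_pk         INTEGER NOT NULL REFERENCES dag(id),
    task_id        TEXT NOT NULL,
    execution_date TEXT NOT NULL,
    state          TEXT,
    PRIMARY KEY (dag_pk, task_id, execution_date)
);
""")

# Register a DAG under its readable name and look up its integer key.
conn.execute("INSERT INTO dag (dag_id) VALUES (?)", ("daily_sales_report",))
dag_pk = conn.execute(
    "SELECT id FROM dag WHERE dag_id = ?", ("daily_sales_report",)
).fetchone()[0]

# Child rows reference the integer key only.
conn.execute(
    "INSERT INTO task_instance VALUES (?, ?, ?, ?)",
    (dag_pk, "extract", "2018-08-09T00:00:00", "success"),
)

# The join uses integers; the readable name is still what gets displayed.
rows = conn.execute("""
    SELECT dag.dag_id, ti.task_id, ti.state
    FROM task_instance AS ti
    JOIN dag ON dag.id = ti.dag_pk
""").fetchall()
print(rows)
```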

Re: Plan to change type of dag_id from String to Number?

2018-08-09 Thread Юли Волкова
How will you understand what your DAG 2 is doing without opening it? For each of 100, for example? Especially if you are not the developer who created it. You are a support team and have 120 DAGs. The first time, when want to also send the answer to dev-mail list. Please, don't do it. I think it will

Re: Identifying delay between schedule & run instances

2018-08-09 Thread vardanguptacse
On 2018/08/09 06:27:30, Bolke de Bruin wrote: > Hi vardang, > > What do you intend to gain from this metric? There are many factors that > influence the difference between execution date and start date. You named one > of them, but there are also functional ones (limits reached etc). We ar

Re: Plan to change type of dag_id from String to Number?

2018-08-09 Thread vardanguptacse
On 2018/08/09 06:29:45, Tao Feng wrote: > +1 on Bolke. I don't think we have such plan. And I believe dag id has been > indexed already in many tables. > > On Wed, Aug 8, 2018 at 11:22 PM, Bolke de Bruin wrote: > > > No we don’t have such plan. Dag ids are used to have a readable > > identi

Re: SubdagOperator and Pools

2018-08-09 Thread Andreas Koeltringer
Hi, to clarify, I created a Gist with instructions for how to reproduce this issue: https://gist.github.com/akoeltringer/63fcf0340ae219c112b2a5377e6d2715 thanks, regards Andreas On 08/09/2018 07:41 AM, Andreas Koeltringer wrote: Hi Tao, thanks for your response. That's just the thing: I

Modeling rate limited api calls in airflow

2018-08-09 Thread rob
Hello, I am in the process of migrating a bespoke data pipeline built around Celery into Airflow. We have a number of different tasks which interact with the AdWords API, which has a rate limiting policy. The policy isn't a fixed number of requests; it's variable. In our Celery code we have han

Re: Custom authentication with RBAC

2018-08-09 Thread Ravi Kotecha
Hi Gabriel, We have extended the auth backend for FAB to support OpenIDConnect here: https://github.com/ministryofjustice/fab-oidc and you can see how to configure it in our helm chart