AIP-13: OpenAPI 3 based API definition

2019-02-01 Thread drew . sonne
Hi all! I've been doing a couple of small PRs here and there on the Airflow project, and I'd like to propose some slightly bigger changes. The summary is that I would like to propose using Swagger to define the Airflow API and as a wrapper for the endpoint handlers, passing off most of the he

Re: AIP-12 Persist DAG into DB

2019-02-01 Thread Dan Davydov
@Max What I've been thinking about recently is creating an abstraction for the serialization process. I think in general it makes sense, e.g. for dynamic DAGs, to have a service that periodically serializes DAGs and uploads them to e.g. a database via some new Airflow DAG Uploader Service. Th

Re: AIP-12 Persist DAG into DB

2019-02-01 Thread Ben Tallman
In my experience, there are two major wins to chase here. Neither is simple, nor is this the first discussion around them. In the past there was an attempt to use pickling to handle these challenges. The first is that with dynamic DAGs (they are evaluated as Python code after all), it is possible

Re: Proposal to remove json_client

2019-02-01 Thread Bolke de Bruin
The Airflow CLI can function as this REST client and does not need to be on the same server as where Airflow is running. Direct DB access is bad from a separation-of-concerns perspective, as you can change task statuses, insert arbitrary things, etc. B.

Help SparkJDBCOperator

2019-02-01 Thread Iván Robla Albarrán
Hi, I am looking for a way to replace Apache Sqoop. I am evaluating SparkJDBCOperator, but I don't understand how to use it. Is it a version of the SparkSubmit operator that takes a JDBC connection? Do I need to include Spark code? Is there an example? Thanks, I am very lost. Regards, Iván Robl

Re: Proposal to remove json_client

2019-02-01 Thread David Cavaletto
Actually the opposite. If we're going to have a REST API, users should interact with it over HTTP(S), using a REST client. If a user has SSH access to the server running Airflow, I don't see the security concern of having the CLI access the DB in the same manner the REST API does. In my diagram all

Re: Trigger DagRun Task based on notification

2019-02-01 Thread Bas Harenslak
The rescheduling sensors are available in Airflow 1.10.2 and can be used by setting the argument mode="reschedule" in your sensor. Cheers, Bas

Re: Trigger DagRun Task based on notification

2019-02-01 Thread raman gupta
Thanks Fokko, We are exploring K8executor, but the number of such long-running jobs would be in the 1000s, so having some non-blocking mechanism would help. Rescheduling in sensors sounds good. Will explore it. Is it available in Airflow 1.10.1? Thanks, Raman Gupta

Re: Trigger DagRun Task based on notification

2019-02-01 Thread Driesprong, Fokko
Hi Raman, Right now this is the way to go. Recently there has been a change to the sensor, in which it will be rescheduled instead of blocking. So this is something that you might want to explore. Otherwise, you might want to choose a more scalable executor such as the Celery or Kubernetes execut
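
The difference between a blocking sensor and a rescheduled one, as discussed in the thread above, can be illustrated with a toy calculation. The sketch below is plain Python, not Airflow code: the function `slot_seconds_held`, its parameters, and the assumption that each poke takes one second are all hypothetical, chosen only to suggest why rescheduling may help when thousands of long-running waits compete for worker slots.

```python
import math


def slot_seconds_held(mode, ready_at, poke_interval, poke_duration=1):
    """Toy model: seconds a single sensor occupies a worker slot.

    Assumptions (illustrative only, not Airflow internals):
    - pokes happen at t = 0, poke_interval, 2 * poke_interval, ...
    - the first poke at or after `ready_at` succeeds
    - each poke takes `poke_duration` seconds
    """
    if mode not in ("poke", "reschedule"):
        raise ValueError("mode must be 'poke' or 'reschedule'")

    # Number of pokes needed until the condition is seen as met.
    pokes = math.ceil(ready_at / poke_interval) + 1

    if mode == "poke":
        # Blocking: the slot is held for the whole wait, including sleeps.
        return (pokes - 1) * poke_interval + poke_duration

    # Rescheduled: the slot is held only while a poke is running.
    return pokes * poke_duration


# One-hour wait, poked every five minutes.
blocking = slot_seconds_held("poke", ready_at=3600, poke_interval=300)
rescheduled = slot_seconds_held("reschedule", ready_at=3600, poke_interval=300)
print(blocking, rescheduled)
```

In this model, a one-hour wait with a five-minute poke interval holds a slot for 3601 seconds when blocking and for 13 seconds when rescheduled. Real behaviour depends on the executor and scheduler, which this sketch does not model.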