Re: [DISCUSS] Packaging DAG/operator dependencies in wheels

2019-12-23 Thread Ash Berlin-Taylor
Kaxil and I are planning on tackling versioning (for display only right now, as it's the first step on this journey) as part of AIP-24. However, the issue with versioning the _entire_ DAG code/environment is that sometimes you want to run in the old env, but sometimes you want to run in a new/latest

Re: [DISCUSS] Packaging DAG/operator dependencies in wheels

2019-12-22 Thread Claudio
Beauchemin Subject: Re: [DISCUSS] Packaging DAG/operator dependencies in wheels Probably it is a good time to revisit https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-5+Remote+DAG+Fetcher again? On Sun, Dec 22, 2019 at 12:16 PM Jarek Potiuk wrote: > I also love the idea of DAG fetcher, It fi

Re: [DISCUSS] Packaging DAG/operator dependencies in wheels

2019-12-22 Thread Chao-Han Tsai
Probably it is a good time to revisit https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-5+Remote+DAG+Fetcher again? On Sun, Dec 22, 2019 at 12:16 PM Jarek Potiuk wrote: > I also love the idea of DAG fetcher, It fits very well the "Python-centric" > rather than "Container-centric" approach

Re: [DISCUSS] Packaging DAG/operator dependencies in wheels

2019-12-22 Thread Jarek Potiuk
I also love the idea of a DAG fetcher. It fits the "Python-centric" rather than "Container-centric" approach very well. Fetching it from different sources like local/.zip and then .wheel seems like an interesting approach. I think the important parts of whatever approach we come up with are: - make

Re: [DISCUSS] Packaging DAG/operator dependencies in wheels

2019-12-22 Thread Tomasz Urbaszek
I like the idea of a DagFetcher (https://github.com/apache/airflow/pull/3138). I think it's a good and simple starting point to fetch .py files from places like the local file system, S3 or GCS (that's what Composer actually does under the hood). As the next step we can think about wheels, zip and other

Re: [DISCUSS] Packaging DAG/operator dependencies in wheels

2019-12-21 Thread Jarek Potiuk
I am in a "before-Xmas" mood so I thought I would write more of my thoughts about it :). *TL;DR: I try to reason (mostly looking at it from the philosophy/usage point of view) why a container-native approach might not be best for Airflow and why we should go Python-first instead.* I also used to be in

Re: [DISCUSS] Packaging DAG/operator dependencies in wheels

2019-12-21 Thread Ash Berlin-Taylor
> For the docker example, you'd almost want to inject or "layer" the DAG script and airflow package at run time. Something sort of like Heroku build packs? -a On 20 December 2019 23:43:30 GMT, Maxime Beauchemin wrote: >This reminds me of the "DagFetcher" idea. Basically a new abstraction >that

Re: [DISCUSS] Packaging DAG/operator dependencies in wheels

2019-12-20 Thread Maxime Beauchemin
This reminds me of the "DagFetcher" idea. Basically a new abstraction that can fetch a DAG object from anywhere and run a task. In theory you could extend it to do "zip on s3", "pex on GFS", "docker on artifactory" or whatever makes sense to your organization. In the proposal I wrote about using a

Re: [DISCUSS] Packaging DAG/operator dependencies in wheels

2019-12-16 Thread Dan Davydov
The zip support is a bit of a hack and was a bit controversial when it was added. I think if we go down the path of supporting more DAG sources, we should make sure we have the right interface in place so we avoid the current `if format == zip then: else:` and make sure that we don't tightly couple
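To illustrate the interface concern above, here is a minimal sketch of dispatching on a DAG source through a small registry instead of an `if format == zip` branch. All names here (`DagSource`, `LocalDirSource`, `ZipSource`, `SOURCES`, `get_source`) are hypothetical and are not part of Airflow or of any proposal in this thread; the sketch only lists candidate .py files and does not load DAGs.

```python
import os
import zipfile
from abc import ABC, abstractmethod
from typing import Dict, List, Type


class DagSource(ABC):
    """Hypothetical interface: report which .py files a location provides."""

    @abstractmethod
    def list_dag_files(self, location: str) -> List[str]:
        """Return the sorted names of Python files found at `location`."""


class LocalDirSource(DagSource):
    """Reads a plain directory on the local file system."""

    def list_dag_files(self, location: str) -> List[str]:
        return sorted(f for f in os.listdir(location) if f.endswith(".py"))


class ZipSource(DagSource):
    """Reads the member names of a .zip archive."""

    def list_dag_files(self, location: str) -> List[str]:
        with zipfile.ZipFile(location) as archive:
            return sorted(n for n in archive.namelist() if n.endswith(".py"))


# Registry keyed by a source kind. Supporting a new format means adding an
# entry here rather than another branch in the calling code.
SOURCES: Dict[str, Type[DagSource]] = {
    "dir": LocalDirSource,
    "zip": ZipSource,
}


def get_source(kind: str) -> DagSource:
    """Look up and instantiate the source registered for `kind`."""
    try:
        return SOURCES[kind]()
    except KeyError:
        raise ValueError(f"no DAG source registered for {kind!r}") from None
```

With this shape, the caller only ever sees `get_source(kind).list_dag_files(location)`, so a wheel-based or remote source could be registered later without touching the code that consumes the file list.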

Re: [DISCUSS] Packaging DAG/operator dependencies in wheels

2019-12-16 Thread Björn Pollex
Hi Jarek, This sounds great. Is this possibly related to the work started in https://github.com/apache/airflow/pull/730? I'm not sure I’m following your proposal entirely. Initially, what would be a great first step would be to support loading DAGs

Re: [DISCUSS] Packaging DAG/operator dependencies in wheels

2019-12-14 Thread Kamil Breguła
Hello, I heard that one team wants to separate DagProcessor from Scheduler in the future. If this division is done along with separating out some class abstraction/plugin mechanism, one plugin can be based on separate Python environments and another on containers. Recently I did refactoring w

[DISCUSS] Packaging DAG/operator dependencies in wheels

2019-12-14 Thread Jarek Potiuk
I had a lot of interesting discussions over the last few days with Apache Airflow users at PyDataWarsaw 2019 (I was actually quite surprised how many people use Airflow in Poland). One discussion brought up an interesting subject: packaging DAGs in wheel format. The users mentioned that they are super-happy us