Re: Airflow DAG Serialisation

2019-07-30 Thread Driesprong, Fokko
Hi Jon, I would argue that that would be the wrong approach. The processing of the DAGs is being moved to one single place, and there they will be parsed in parallel. This is already being done in the scheduler (look for max_threads in the code). As soon as we have the DAG serialized properly, the

Re: [VOTE] Changes in import paths

2019-07-30 Thread Tomasz Urbaszek
Maybe we can put all those AtoB operators under one name like “transfer”, then it would be easier to look for such operators? Best, Tomek On Tue, 30 Jul 2019 at 21:39, Chen Tong wrote: > Daniel mentioned a good point. Such a composed operator may also involve > both cloud and non-cloud provider s

Re: [VOTE] Changes in import paths

2019-07-30 Thread Chen Tong
Daniel mentioned a good point. Such a composed operator may also involve both a cloud and a non-cloud provider, say FTPtoS3Operator. Should it be in the AWS folder? Also, say that in the future another cloud provider grows large enough. Will we rename all plugins related to this provider? What's the cri

Re: [VOTE] Changes in import paths

2019-07-30 Thread Daniel Standish
One wrinkle with case 3+4 option D is inter-provider operators. Mainly it's storage, I think, e.g. XToS3Operator or XToGCSOperator, where the X is a service from a different provider. Maybe the rule should be to locate the operator according to the first provider referenced. So e.g. s3_to_gcs_transfe

Re: Airflow DAG Serialisation

2019-07-30 Thread Ash Berlin-Taylor
Hi Jon, As part of this AIP (24) we aren't going to touch the scheduler any more than absolutely required, but yes, better support of dynamic DAGs is _very much_ on Kaxil's and my hit list. Our rough approach right now is to design the serialisation format well enough (including versioning it so

Re: Airflow DAG Serialisation

2019-07-30 Thread Jonathan Miles
Another ask for the long-term list. From a superficial read of the code, it looks like this asynchronous DAG loading approach could also be a stepping stone towards loading DAGs in parallel? I've come across a case of someone dynamically generating a DAG based on an external data source. Probl

Re: [VOTE] Changes in import paths

2019-07-30 Thread Kamil Breguła
Yes. All changes will be backwards compatible. In the case of using the old path, a message containing a proposal for change will be reported to the user. I prepared an example of how to change the name of a class in a case with the use of a native solution. Source code: https://github.com/mik-la

Re: Airflow DAG Serialisation

2019-07-30 Thread Ash Berlin-Taylor
The one added complexity in back-porting this to 1.10.x is that we have two webservers (classic and RBAC) so either we only add this feature to the RBAC path for a 1.10.5 (which I am okay with) or someone other than me ports any changes to the classic UI once it's merged to master ;) -ash > On

Re: [VOTE] Changes in import paths

2019-07-30 Thread Ash Berlin-Taylor
Just cos I'm not sure it's _explicitly_ stated, but all of the moves will have a deprecation of the old name right? 3+4 case D gets my vote too. -a > On 30 Jul 2019, at 09:58, Jarek Potiuk wrote: > > I went ahead and updated the page (just to speed it up) as I think it > really makes sense to

Re: [VOTE] Changes in import paths

2019-07-30 Thread Jarek Potiuk
I went ahead and updated the page (just to speed it up) as I think it really makes sense to join those two cases (and I do not see any drawbacks - I think the options we have cover all possible approaches) and we can always go back if we need to. https://cwiki.apache.org/confluence/display/AIRFLOW

Re: Airflow DAG Serialisation

2019-07-30 Thread Jarek Potiuk
I think Zhou's change is pretty much backwards-compatible with 1.10.x - it's basically an optimisation that people might find really useful until 2.0 is out. I believe (correct me if I am wrong) it does not require any change from the user's perspective. Airflow will continue to behave the same way as

Re: [VOTE] Changes in import paths

2019-07-30 Thread Jarek Potiuk
I think almost everyone voted and we have almost perfect consensus. We all agree, amongst other things, on moving all operators out of contrib (Great!). The only doubts are for *Case 3* (Cloud provider prefix) and *Case 4* (Using Namespaces). I think there was actually an overlap between those two. Also As