Re: [AIP-35] Add Signal Based Scheduling To Airflow

2020-06-16 Thread Chris Palmer
Nicholas, Are you saying that you actually have tasks in Airflow that are intended to run indefinitely? That in and of itself seems to be a huge fundamental departure from many of the assumptions built into Airflow. Chris On Tue, Jun 16, 2020 at 12:00 PM Gerard Casas Saez wrote: > That looks inte

Re: [AIP-34] Rewrite SubDagOperator

2020-06-12 Thread Chris Palmer
I agree that SubDAGs are an overly complex abstraction. I think what is needed/useful is a TaskGroup concept. On a high level I think you want this functionality: - Tasks can be added to a TaskGroup - You *can* have dependencies between Tasks in the same TaskGroup, but *cannot* have depen

Re: Stateful Tasks (was Poke Reschedule)

2020-01-14 Thread Chris Palmer
I think some of the discussions about incremental and/or idempotency are confusing the topic and are a distraction from the real question. As I said in my previous reply on this thread, many tasks utilize state that is kept somewhere in order to achieve idempotency in an efficient way. Whether that

Re: [Discussion] In Prep for AIP: Stateful XComs and Poke Rescheduling in Operators

2020-01-10 Thread Chris Palmer
I agree with Jarek that maintaining state between retries is not the right thing to do. To be honest I'm not even convinced by the need for state between reschedules myself. While I know from past experience that FTP is a pain to deal with, I think that your example is a pretty niche one. Addition

Re: AIP-21 - grouping google operators

2019-10-04 Thread Chris Palmer
> tables because they have a very large number of services. If Oracle > will have many integrations, it is worth emphasizing this fact and > moving these integrations to a separate place, so that it is easier to > find them. and use. Keeping all possible files in one place makes it

Re: AIP-21 - grouping google operators

2019-10-04 Thread Chris Palmer
This seems unnecessary to me. Is everything going to be under some 'provider' or just certain sets of operators, and if so what differentiates when something should be under a provider or not? For example, are the mysql operators going to go under 'provider/oracle/'? Chris On Fri, Oct 4, 2019 at

Re: AIP-21 (Move operators to Core) - "cross_transfer" packages

2019-10-04 Thread Chris Palmer
ision. > > > > If no-one objects (Lazy Consensus > > <https://community.apache.org/committers/lazyConsensus.html>) till > > Monday 7th of October, 3.20 CEST, we will update AIP-21 with information > > that transfer operators should be placed in the "source" p

Re: AIP-21 (Move operators to Core) - "cross_transfer" packages

2019-09-23 Thread Chris Palmer
On Mon, Sep 23, 2019 at 1:22 PM Kamil Breguła wrote: > On Mon, Sep 23, 2019 at 7:04 PM Chris Palmer wrote: > > > > Is there a reason why we can't use symlinks to have copies of the files > > show up in both subpackages? So that `gcs_to_s3.py` would be under both >

Re: AIP-21 (Move operators to Core) - "cross_transfer" packages

2019-09-23 Thread Chris Palmer
Is there a reason why we can't use symlinks to have copies of the files show up in both subpackages? So that `gcs_to_s3.py` would be under both `aws/operators/` and `gcp/operators`. I could imagine there may be technical reasons why this is a bad idea, but just thought I would ask. If that is not

Re: Manipulating the DAG Code View?

2019-08-13 Thread Chris Palmer
A more involved PR like you suggest might be valuable in the long run, but in the short term I've been successful in the past with simply modifying the fileloc attribute of DAGs. It gets set here to the previous frame, but

Re: CLI: Use nested commands instead of flags

2019-06-19 Thread Chris Palmer
I think it can make sense to use both singular and plural, if you structure your commands in the right way. For example Kubernetes generally uses the structure of: kubectl where target can be singular or plural depending on what you want to take action on. This is opposed to GCP which uses the g

Tasks that run just once

2019-05-13 Thread Chris Palmer
I'm trying to design a set of DAGs to do a one-time create and backfill of a set of tables in BigQuery and then perform periodic loads into those tables. I can't quite get it to work the way I want to and I'm wondering if other people have solved similar problems. The parameters are as follows: I have