Re: [VOTE] AIP-6 Apply Pylint to Airflow

2019-04-15 Thread Daniel Imberman
+1 (binding) On Mon, Apr 15, 2019 at 12:29 PM Maxime Beauchemin < maximebeauche...@gmail.com> wrote: > pylint and black are super solid, no questions there afaic > > Max > > On Mon, Apr 15, 2019 at 11:48 AM m...@maximilianroos.com < > m...@maximilianroos.com> > wrote: > > > Hi there, > > > > I

Re: [VOTE] AIP-6 Apply Pylint to Airflow

2019-04-15 Thread Maxime Beauchemin
pylint and black are super solid, no questions there afaic Max On Mon, Apr 15, 2019 at 11:48 AM m...@maximilianroos.com wrote: > Hi there, > > I haven't been active in the airflow community so this should be weighed > appropriately. > > I'm a core dev of a couple of other libraries (xarray,

Re: [VOTE] AIP-6 Apply Pylint to Airflow

2019-04-15 Thread m
Hi there, I haven't been active in the airflow community so this should be weighed appropriately. I'm a core dev of a couple of other libraries (xarray, pandas-gbq) and have led initiatives to clean up the code in both of those. While I'm very enthusiastic about auto-formatting tools, my

Re: Airflow 1.10.3 has been released!

2019-04-15 Thread Daniel Imberman
Woo! Thanks Ash! On Mon, Apr 15, 2019 at 8:25 AM Kamil Gałuszka wrote: > Thanks Ash for this release and helping with my PR! > > Kamil > > > On Sat, Apr 13, 2019 at 10:43 PM Kamil Breguła > wrote: > > > I am impressed by your hard work. I know that cherry-picking my doc changes > > was not a

Re: Portland Apache Airflow Meetup - Call for speakers

2019-04-15 Thread Rafael Cavazin
Hello there, where will this be located? Can you share the agenda? With a smile, Rafael Cavazin

Portland Apache Airflow Meetup - Call for speakers

2019-04-15 Thread Danny Gene Duncan
Hello all, We have an upcoming meeting on April 25 and we're opening it up for speakers. So far Ben Tallman and I will present, but we'd also love to hear how others are using airflow. The format will be a 15 minute talk with 5 minutes for Q&A. We'd like to get at least one other person to

Re: Longer term Airflow planning

2019-04-15 Thread James Meickle
Now that this thread has been open for a minute, I'll loop back to this with my own thoughts... I'm not complaining about the overall pace of development. Instead I'd just like us to discuss how and why that pace is concentrated in bugfixes, UI tweaks, new operators, and so on. Check out the

Re: Airflow 1.10.3 has been released!

2019-04-15 Thread Kamil Gałuszka
Thanks Ash for this release and helping with my PR! Kamil On Sat, Apr 13, 2019 at 10:43 PM Kamil Breguła wrote: > I am impressed by your hard work. I know that cherry-picking my doc changes > was not a simple process. :-) > > Thanks, Ash :-) > > On Sat, Apr 13, 2019 at 12:03 PM Felix Uellendall

Re: [DISCUSS] period_start/period_end instead of execution_date/next_execution_date

2019-04-15 Thread Dan Davydov
You could start a [VOTE][PMC ONLY] thread on this topic (https://www.apache.org/foundation/voting.html). Not sure if that's the best Apache way of doing things, but seems fine to me. My PMC vote personally would maybe be to switch the semantics to the opposite of what they are now without having

Re: [DISCUSS] period_start/period_end instead of execution_date/next_execution_date

2019-04-15 Thread James Meickle
Personally I would be very interested in working on a flexible schedule window/window projection patch. But it would be a big undertaking so it doesn't make sense to start it unless there's a lot of community buy-in to the idea that we aren't just for day-after ETL systems. On Mon, Apr 15, 2019

Re: [2.0 spring cleaning] Deprecate adding Operators and Hooks via plugins?

2019-04-15 Thread James Meickle
The way I'm thinking about this issue is like this:
- What features can plugins currently provide
- What non-plugin features in Airflow are currently hard to extend in a way that would benefit from pluggability
- Out of the union of that feature set: which of them can be easily done via Python

Re: [DISCUSS] period_start/period_end instead of execution_date/next_execution_date

2019-04-15 Thread airflowuser
To quote my user-experience professor from ages ago: "If too many people misuse something you wrote it means that YOU are doing something wrong". Something can be well documented but if it's not intuitive it's likely that people will get it wrong. Say someone asks "When did you execute the

Re: [DISCUSS] period_start/period_end instead of execution_date/next_execution_date

2019-04-15 Thread Dan Davydov
I think if the mission of Airflow is to be a generic Workflow engine, the current semantics of execution date aren't a good default. This might be an unpopular opinion given past threads on this topic :). The execution_date = end_date semantics make sense for the ETL use case but not for other
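For readers new to this thread, the naming question can be illustrated with a minimal sketch, assuming a fixed schedule interval. The helper `period_for` is hypothetical and not part of Airflow; it only shows that, under the current semantics, a run labelled with a given execution_date covers the interval that begins at that date and is triggered once that interval has ended.

```python
from datetime import datetime, timedelta


def period_for(execution_date, interval=timedelta(days=1)):
    """Hypothetical helper (not an Airflow API).

    Maps an execution_date to the (period_start, period_end) pair
    proposed in the thread subject, assuming a fixed interval.
    """
    period_start = execution_date
    # The run is only triggered once the whole interval has elapsed.
    period_end = execution_date + interval
    return period_start, period_end


start, end = period_for(datetime(2019, 4, 14))
print("covers", start.isoformat(), "to", end.isoformat())
```

With a daily interval, the run whose execution_date is 2019-04-14 does not start until 2019-04-15, which is the mismatch the thread is debating.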

Re: [2.0 spring cleaning] Remove the EMR connection type.

2019-04-15 Thread Ash Berlin-Taylor
Variable is either all encrypted as a single blob or all plain. I think changing to use a single EMR connection type and require the key/secret/role info to be in that connection makes the most sense. -a > On 15 Apr 2019, at 12:03, Daniel Mateus Pires wrote: > > In our company we use EMR

Re: [2.0 spring cleaning] Deprecate subdags

2019-04-15 Thread Dan Davydov
I don't think fixing subdags to run in the scheduler is enough, although it's a huge improvement over the current implementation (especially the part that lets Subdags specify custom executors). From my experience with Subdags, I think what makes more sense is adding various operators to allow

Re: [2.0 spring cleaning] Remove the EMR connection type.

2019-04-15 Thread Daniel Mateus Pires
In our company we use EMR based operators a lot and it's always been confusing for new users to find the different kinds of EMR clusters as "Connections". Not sure you could just remove the aws_conn_id, because the emr_conn_id doesn't define which AWS account, which region, which profile to use

Re: [2.0 spring cleaning] Remove the EMR connection type.

2019-04-15 Thread Ash Berlin-Taylor
Or we should remove the aws_conn_id from the Emr* (hook and op) rather than passing in two connection types. Anyone have a thought as to which way to go? > On 15 Apr 2019, at 11:51, Ash Berlin-Taylor wrote: > > We have an EMR connection type, but the operator actually uses this as a > config

[2.0 spring cleaning] Remove the EMR connection type.

2019-04-15 Thread Ash Berlin-Taylor
We have an EMR connection type, but the operator actually uses this as a config value, and the actual credentials come from the default aws_conn_id:
def __init__(
        self,
        aws_conn_id='s3_default',
        emr_conn_id='emr_default',

Re: [2.0 spring cleaning] Deprecate contrib folder?

2019-04-15 Thread airflowuser
Maybe a side issue but... Why must we specify each operator in its own import? Why can't we just do something like from airflow import operators,hook and all operators & hooks will become available. We don't have that many and even looking forward to the future it's unlikely that it will grow by

Re: [2.0 spring cleaning] Deprecate contrib folder?

2019-04-15 Thread Julian De Ruiter
The downside of keeping everything in one gigantic codebase is that it also becomes a monster in terms of dependencies and testing - something that Airflow is already experiencing issues with. This is also exactly why I initially had trouble contributing to Airflow, as tests are near