I would be very much in favour of moving to the naming approach you propose 
(conn_id for hooks, src_conn_id and dest_conn_id for operators with multiple 
connections), which I think is much more consistent than the current naming 
conventions. The added advantage of this naming is also that it makes it much 
easier in the future to work towards more generic operators/hooks where we (for 
example) copy from one database or file system without caring which file 
systems are involved. This avoids the wildfire of the current Airflow codebase 
in which we end up with an operator for every different combination of file 
systems. 

Breaking changes could be maintained via deprecation warnings for a number of 
releases to avoid deterring users, whilst pushing towards a cleaner interface.

On 2018/05/30 01:19:37, "Daniel (Daniel Lamblin) [BDP - Seoul]" 
<lamb...@coupang.com> wrote: 
> The short of this email is: can we please name all the connection id named 
> parameters to all hooks and operators as similarly as possible. EG just 
> `conn_id`?
> 
> So, when we started using Airflow I _had_ thought that minor versions would 
> be compatible for a user's DAG, assuming no use of anything marked beta or 
> deprecated, such that v1.7.1, 1.8.0, 1.8.1, 1.8.2 and 1.9.0 etc would all run 
> dags from prior versions in that linage, each with possible stability and 
> feature improvements and each with possibly more operators, hooks, executors 
> (even) etc.
> 
> This is (now) obviously not the case, and it's the group's choice about what 
> should and what should not break on a release-by-release basis. I think a 
> clear policy would be appropriate for full Apache status, but then I may have 
> missed where the policy is defined.  Though, if defined as not giving 
> stability to the user's dags for most version changes isn't in my opinion 
> going to grow confidence for Airflow being something you can grow with.
> 
> Not to be overly dramatic, but currently the tiny change that the `s3Hook(…)` 
> takes `aws_conn_id='any_string'` vs `s3_conn_id='still_any_string'` means 
> that basically I have to maintain a 1.8.2 setup in perpetuity, because no one 
> (here) wants to briefly code freeze before during and after an update so that 
> we can update all the uses and be ready to roll back the update if something 
> else breaks (also some people access the not-actually-private `_a_key and 
> _s_key` and would need to switch to still-not-private 
> `_get_credentials()[…]`). Instead we'll just run a second airflow setup at 
> 1.9.0 (in each and every staged environment) and move the 8k dags piecemeal 
> whe[never] people get the spare time and inclination. I mean, we might. It 
> looks like 1.10.0 changes some config around s3 logging one more time… so 
> maybe it's better to skip it?
> 
> I saw the couple of PRs where the project itself had to make the changes to 
> usages of the named field. There was momentary and passing concern that 
> users' dags would need to do the same. In the PRs, of the options discussed 
> (shims, supporting the deprecated name as deprecated, doing a hard rename), 
> it wasn't brought up if the rename to aws_conn_id was the best name. [Was 
> this discussed elsewhere?]
> 
> And so I wondered why is there this virtual Hungarian notation on all the 
> `conn_id`s?
> A hook generally takes one `conn_id`, most operators take only one. In these 
> cases couldn't the named parameter have been `conn_id`? When an operator 
> needs a couple conn_ids, could it not have `src_conn_id` and `dest_conn_id` 
> instead of locking in either s3 or aws, mysql or maria, dataflow or beam etc? 
> Those are hypotheticals, I believe.
> 
> Could I propose to plan to break absolutely all uses of `{named}_conn_id`s, 
> before or by version 2.0, with an eye towards never again having to break a 
> named parameter for the rest of 2.x's life? There's probably more named 
> parameters that should be fixed per major release, and if you agree, some 
> work should be done to identify them all, but this just happens to be the one 
> that seems most likely to change often, and has so recently.
> 
> Thanks,
> -Daniel
> 
> 

Reply via email to