@Vikram, yeah, I understand that concern.  Making a breaking change is
something we shouldn't do carelessly, without consideration, without good
reason.  But I think we have a good reason here.

Deprecations are somewhat regular in airflow.  In 2.0 we moved a lot of
imports.  We also changed the CLI semantics.  Probably there were other
breaking changes that I'm forgetting *(granted sometimes we didn't remove
the backward-compatibility immediately in 2.0, and that remains an option
here)*.  Probably just about every dag in existence had to be changed then
too.  And that's what a major release is for.

In 3.0, it looks like we're deprecating execution_date...  That's even
bigger.  So, were we to do this, (1) moving to `schedule` wouldn't be the
only notable breaking change and (2) by comparison it's a pretty simple
change.  And if, as part of the upgrade to a major release, you have to do
renames and such anyway, the marginal cost of one more isn't massive.

I guess my take is that we can't be chained to our previous decisions
forever.  And we have to have the flexibility to adapt our interfaces so
that they still make sense and are consistent and don't cause excessive
eyebrow wrinkling.  We have adopted semantic versioning to provide a
mechanism for signaling when breaking changes are coming.  And we can add
tooling to help users along the way.

Additionally, we can't *only *look at the burden of such a change for
users. We must also weigh this against the burden that we maintain by
*not* changing, i.e.
by having too many params, or ill-named params.


On Thu, Aug 4, 2022 at 2:57 PM Jed Cunningham <jedcunning...@apache.org>
wrote:

> I'm not in favor of reusing `schedule_interval` for datasets (or folding
> `timetable` into it either), primarily because it doesn't feel like a good
> name to describe what it does. If we don't have the appetite to
> consolidate, my vote is we stick with 3 separate params. So preferably #1,
> if not then #2 for me.
>
> I'm less concerned about the "everyone has to update every DAG" issue, as
> it only becomes a problem for Airflow 3, folks can freely ignore it until
> then, and when they do update it it's a simple find/replace.
>
>
> On Thu, Aug 4, 2022 at 2:17 PM Andrew Gibbs <a.r.gibbs.massma...@gmail.com>
> wrote:
>
>> I'll just +1 that IMO a UI reminder is a great option.
>>
>> 2.3.0 added an alert when robots.txt was being scanned, which was great,
>> because I ignored it for a while, but eventually caved and looked into it.
>> This I would see as more of the same. I currently get loads of warnings
>> about things in my config file that have moved; but typically only if I
>> have to do a `airflow db upgrade` or `airflow dags backfill`. I see the
>> alert, think "I must fix those", and then the Homer effect kicks in, and
>> something else takes up the space in my brain. Having them as a banner in
>> the GUI would 100% get me to sort it sooner or later.
>>
>> On Thu, Aug 4, 2022 at 9:32 PM Vikram Koka <vik...@astronomer.io.invalid>
>> wrote:
>>
>>> Hi Daniel,
>>>
>>> I completely agree with wanting to consolidate, but I am very concerned
>>> about people needing to change their existing code.
>>> As Ping and Felix also said, I worry that people ignore deprecation
>>> warnings, especially since the original code authors don't really look at
>>> logs or other pipeline artifacts unless something is broken.
>>>
>>> With respect to the options you outlined, I am personally leaning
>>> towards (4) i.e. "don't deprecate timetable, but remove schedule_on (AIP-48
>>> has added it in main) and take List[Dataset] in `schedule_interval`". I
>>> could be convinced with (3), since I think that "timetable" is relatively
>>> recent and have not seen as much usage for this invocation yet. I am
>>> definitely curious about other perspectives on this as well.
>>>
>>> I have a hard time with option (1) of deprecating "schedule interval",
>>> because of the historical nature of it and the code changes that people may
>>> have to make to deal with it.
>>>
>>> Thanks for raising this topic and the discussion,
>>>
>>> Vikram
>>>
>>>
>>> On Thu, Aug 4, 2022 at 10:24 AM Daniel Standish
>>> <daniel.stand...@astronomer.io.invalid> wrote:
>>>
>>>> I agree with those (including myself) who have suggested that it would
>>>> be great to have a deprecation fix tool.
>>>>
>>>> I also think that UI notifications would be valuable.  Another option
>>>> would be to provide a less intrusive indicator somewhere (like when chrome
>>>> lets you know you need to update), and then you click on it and it takes
>>>> you to a page that lists out all the deprecation warnings.
>>>>
>>>>
>>>> On Thu, Aug 4, 2022 at 1:03 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>>>>
>>>>> I know we are distracting a bit from the main subject, but I think
>>>>> it's worth it.
>>>>>
>>>>> I think this is a fact of life that people don't remove deprecations.
>>>>> But we have a choice of "not doing anything about it" and complaining. Or
>>>>> doing something that might improve it.
>>>>>
>>>>> I usually err on the side of "doing something" when I see a problem. I
>>>>> am a doer :).
>>>>>
>>>>> And I have a possibly controversial proposal. I got the inspiration
>>>>> from node.js old versions. Whenever an old version of Node is being used,
>>>>> you have a 20 seconds delay artificially added and information that in
>>>>> order to remove those 20 seconds you need to upgrade.
>>>>>
>>>>> Why don't we do that for some really old/serious deprecations ? Maybe
>>>>> not 20 seconds. Maybe adding an unremovable RED warning in the UI if you
>>>>> have any deprecation unhandled. Something that is annoying enough that
>>>>> will cause people to make an action but not intrusive enough to break
>>>>> things.
>>>>>
>>>>> I think this is a very basic psychological trait:
>>>>>
>>>>> If it does not cost the user anything to leave the deprecations, there
>>>>> will be a big number of people who will do that.
>>>>> We need to make sure that deprecations are costing our users "a
>>>>> little" and that they are aware that they can remove the cost (and know 
>>>>> how
>>>>> - this is super important that we only show such a message with clear
>>>>> instructions on what to do).
>>>>>
>>>>> Such a message might be annoying in many ways:
>>>>>
>>>>> * it can obstruct significant part of the screen
>>>>> * It can be flashy and had very annoying colours (think bold pink
>>>>> fonts on flash-green background for example)
>>>>> * It can not allow to do anything before the user closes it with a
>>>>> REALLY small 'x' that it is difficult to point at (think cookie banners).
>>>>> And when they click they can be greeted with "You know - you can actually
>>>>> remove this warning by doing THIS".
>>>>> * and we have probably more options - our UX specialists might have
>>>>> some good ideas here
>>>>>
>>>>> For anyone who complains about "people do not act on deprecation." -
>>>>> yeah this is true.
>>>>> But one universal truth there is that if we won't act, this will
>>>>> change nothing in the behaviour of our users - this is 100% certain.
>>>>>
>>>>> If we do something - it might improve it. We will not get a 100%
>>>>> success for sure, but we can at least do something to improve it.
>>>>>
>>>>> WDYT?
>>>>>
>>>>> J.
>>>>>
>>>>>
>>>>> On Thu, Aug 4, 2022 at 6:52 AM Felix Uellendall <felue...@pm.me.invalid>
>>>>> wrote:
>>>>>
>>>>>> Hi Ping,
>>>>>>
>>>>>> I had this idea about auto-fixing deprecation warnings too when I
>>>>>> started the airflint project.
>>>>>> I totally agree with what you said that often people ignore these
>>>>>> warnings and therefore a cli tool would come in very handy. It could look
>>>>>> similar to the airflint project and also be based on the python refactor
>>>>>> package, but I would keep it out of this discussion for now.
>>>>>>
>>>>>> Regarding the schedule arg, I don’t have a strong opinion there but
>>>>>> also tend to having one schedule arg, because of simplicity for the
>>>>>> dag author.
>>>>>>
>>>>>> Best,
>>>>>> Felix
>>>>>>
>>>>>>
>>>>>>
>>>>>> Sent from Proton Mail for iOS
>>>>>>
>>>>>>
>>>>>> On Thu, Aug 4, 2022 at 00:23, Ping Zhang <pin...@umich.edu> wrote:
>>>>>>
>>>>>> Hi Daniel,
>>>>>>
>>>>>> +1 for ‘schedule' and thanks for bringing this up.
>>>>>>
>>>>>> I agree with Vikram, we should be very careful about deprecating
>>>>>> existing params even if we have warnings around it. Not sure if this is a
>>>>>> general case, but I notice that people ignore warnings all the time until
>>>>>> it starts to break. Thus I am thinking if airflow can have a cli tool to
>>>>>> help users to batch update these deprecating params just like python's
>>>>>> futurize tool. This can greatly ease the burden of upgrading the airflow
>>>>>> version.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Ping
>>>>>>
>>>>>>
>>>>>> On Wed, Aug 3, 2022 at 3:13 PM Daniel Standish
>>>>>> <daniel.stand...@astronomer.io.invalid> wrote:
>>>>>>
>>>>>>> In case you forgot about us @Vikram, gentle nudge
>>>>>>>
>>>>>>> On Mon, Aug 1, 2022 at 8:40 AM Daniel Standish <
>>>>>>> daniel.stand...@astronomer.io> wrote:
>>>>>>>
>>>>>>>> Well, this is a discussion thread, so let's discuss!
>>>>>>>>
>>>>>>>> Vikram, what kind of implications are you thinking of? Maybe you
>>>>>>>> could provide a little more detail about your concerns?
>>>>>>>>
>>>>>>>> There are certainly alternatives, and of course each of them has
>>>>>>>> tradeoffs.
>>>>>>>>
>>>>>>>> Here are the options I see:
>>>>>>>>
>>>>>>>> 1. consolidate to schedule (deprecate schedule_interval and
>>>>>>>> timetable; the proposal under discussion in this thread)
>>>>>>>> 2. do nothing (keep 3 different scheduling params)
>>>>>>>> 3. consolidate to schedule_interval (deprecate timetable)
>>>>>>>> 4. don't deprecate timetable, but remove schedule_on (AIP-48 has
>>>>>>>> added it in main) and take List[Dataset] in `schedule_interval`
>>>>>>>>
>>>>>>>> Which would you propose we go with?
>>>>>>>>
>>>>>>>> If we have a veto against 1 (and again, we're in hypothetical
>>>>>>>> territory since this is still a discussion thread), my vote would
>>>>>>>> definitely be to go with 3.
>>>>>>>>
>>>>>>>> The param `schedule_interval` as catch all is not *sooo *bad, but
>>>>>>>> `schedule` definitely seems better. From a new user's perspective, 
>>>>>>>> having
>>>>>>>> misleading names is confusing, and can be a negative when looking at
>>>>>>>> adoption. And for a tenured user, it can nag at you. And dealing with
>>>>>>>> deprecations as an Airflow user, well it's sort of part of the job, 
>>>>>>>> though
>>>>>>>> it can be bothersome, certainly.
>>>>>>>>
>>>>>>>> Just to look at / compare feasibility.
>>>>>>>>
>>>>>>>> Deprecating schedule_interval, it would be more prominent than some
>>>>>>>> other deprecations because it's a DAG param. That said, I think it's 
>>>>>>>> pretty
>>>>>>>> straightforward to handle. For one, it's at the cardinality of DAG and 
>>>>>>>> not
>>>>>>>> *task*, so there is less to change with than some other
>>>>>>>> deprecations. And being kwargs-only, it's an easy find and replace that
>>>>>>>> could be automated with an `upgrade-check` script. It's very easy to
>>>>>>>> detect, either to warn about, or ultimately, after upgrade, to fail 
>>>>>>>> the DAG
>>>>>>>> parse (which forces user to fix). Perhaps we could even experiment 
>>>>>>>> with a
>>>>>>>> deprecation fix tool, so that users could get help fixing 
>>>>>>>> deprecations, not
>>>>>>>> necessarily in the context of an upgrade, since we're not looking at 
>>>>>>>> 3.0
>>>>>>>> any time soon.
>>>>>>>>
>>>>>>>> We could compare with execution_date, if execution_date is to be
>>>>>>>> removed in 3.0. I think the deprecation of execution_date will be a 
>>>>>>>> more
>>>>>>>> difficult thing for users to handle. For one, potentially higher
>>>>>>>> cardinality, since it is used in tasks. But also, the use of 
>>>>>>>> execution_date
>>>>>>>> can be harder to detect, since it can be in templates and elsewhere in
>>>>>>>> custom operators. *(With schedule_interval, since it's a top-level
>>>>>>>> dag kwarg, it will be easier.)* And execution_date was replaced
>>>>>>>> with `logical_date` for similar reasons -- avoiding user confusion.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Jul 29, 2022 at 2:42 PM Vikram Koka
>>>>>>>> <vik...@astronomer.io.invalid> wrote:
>>>>>>>>
>>>>>>>>> -1 from me.
>>>>>>>>>
>>>>>>>>> Though I agree in principle with the idea of consolidation, I
>>>>>>>>> don't think we should be doing this yet until we understand the
>>>>>>>>> implications completely.
>>>>>>>>> I am really not in favor of deprecation of the existing params,
>>>>>>>>> unless there is really no alternative.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Jul 29, 2022 at 2:37 PM Daniel Standish
>>>>>>>>> <daniel.stand...@astronomer.io.invalid> wrote:
>>>>>>>>>
>>>>>>>>>> So far, seems all in favor.
>>>>>>>>>>
>>>>>>>>>> I will just highlight, in case it's not clear.... when we release
>>>>>>>>>> this (presumably 2.4), basically every single dag in existence will 
>>>>>>>>>> start
>>>>>>>>>> emitting deprecation warnings, and prior to 3.0, basically every dag 
>>>>>>>>>> in
>>>>>>>>>> existence will need to be updated.
>>>>>>>>>>
>>>>>>>>>> Thankfully, for most people, this should be an easy update since
>>>>>>>>>> DAG is kwargs only, IIRC.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Jul 29, 2022, 2:27 PM Brent Bovenzi
>>>>>>>>>> <br...@astronomer.io.invalid> wrote:
>>>>>>>>>>
>>>>>>>>>>> +1 to consolidating and "schedule".
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jul 29, 2022, 10:12 PM Drew Hubl
>>>>>>>>>>> <drew.h...@astronomer.io.invalid> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> +1 for ‘schedule', and another +1 to the importance of
>>>>>>>>>>>> consolidating to one being more important than the name of that one
>>>>>>>>>>>>
>>>>>>>>>>>> On Jul 29, 2022, at 1:58 PM, Josh Fell <
>>>>>>>>>>>> josh.d.f...@astronomer.io.INVALID> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Consolidating to a single scheduling parameter is a big +1 from
>>>>>>>>>>>> me. `schedule` seems like a nice catch-all.
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jul 29, 2022 at 9:10 AM Jed Cunningham <
>>>>>>>>>>>> jedcunning...@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> +1 on moving to a single `schedule` param.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>

Reply via email to