I'm not in favor of reusing `schedule_interval` for datasets (or folding
`timetable` into it either), primarily because it doesn't feel like a good
name to describe what it does. If we don't have the appetite to
consolidate, my vote is we stick with 3 separate params. So preferably #1,
if not then #2 for me.

I'm less concerned about the "everyone has to update every DAG" issue, as
it only becomes a problem for Airflow 3, folks can freely ignore it until
then, and when they do update it it's a simple find/replace.


On Thu, Aug 4, 2022 at 2:17 PM Andrew Gibbs <a.r.gibbs.massma...@gmail.com>
wrote:

> I'll just +1 that IMO a UI reminder is a great option.
>
> 2.3.0 added an alert when robots.txt was being scanned, which was great,
> because I ignored it for a while, but eventually caved and looked into it.
> This I would see as more of the same. I currently get loads of warnings
> about things in my config file that have moved; but typically only if I
> have to do a `airflow db upgrade` or `airflow dags backfill`. I see the
> alert, think "I must fix those", and then the Homer effect kicks in, and
> something else takes up the space in my brain. Having them as a banner in
> the GUI would 100% get me to sort it sooner or later.
>
> On Thu, Aug 4, 2022 at 9:32 PM Vikram Koka <vik...@astronomer.io.invalid>
> wrote:
>
>> Hi Daniel,
>>
>> I completely agree with wanting to consolidate, but I am very concerned
>> about people needing to change their existing code.
>> As Ping and Felix also said, I worry that people ignore deprecation
>> warnings, especially since the original code authors don't really look at
>> logs or other pipeline artifacts unless something is broken.
>>
>> With respect to the options you outlined, I am personally leaning towards
>> (4) i.e. "don't deprecate timetable, but remove schedule_on (AIP-48 has
>> added it in main) and take List[Dataset] in `schedule_interval`". I could
>> be convinced with (3), since I think that "timetable" is relatively recent
>> and have not seen as much usage for this invocation yet. I am definitely
>> curious about other perspectives on this as well.
>>
>> I have a hard time with option (1) of deprecating "schedule interval",
>> because of the historical nature of it and the code changes that people may
>> have to make to deal with it.
>>
>> Thanks for raising this topic and the discussion,
>>
>> Vikram
>>
>>
>> On Thu, Aug 4, 2022 at 10:24 AM Daniel Standish
>> <daniel.stand...@astronomer.io.invalid> wrote:
>>
>>> I agree with those (including myself) who have suggested that it would
>>> be great to have a deprecation fix tool.
>>>
>>> I also think that UI notifications would be valuable.  Another option
>>> would be to provide a less intrusive indicator somewhere (like when chrome
>>> lets you know you need to update), and then you click on it and it takes
>>> you to a page that lists out all the deprecation warnings.
>>>
>>>
>>> On Thu, Aug 4, 2022 at 1:03 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>>>
>>>> I know we are distracting a bit from the main subject, but I think it's
>>>> worth it.
>>>>
>>>> I think this is a fact of life that people don't remove deprecations.
>>>> But we have a choice of "not doing anything about it" and complaining. Or
>>>> doing something that might improve it.
>>>>
>>>> I usually err on the side of "doing something" when I see a problem. I
>>>> am a doer :).
>>>>
>>>> And I have a possibly controversial proposal. I got the inspiration
>>>> from node.js old versions. Whenever an old version of Node is being used,
>>>> you have a 20 seconds delay artificially added and information that in
>>>> order to remove those 20 seconds you need to upgrade.
>>>>
>>>> Why don't we do that for some really old/serious deprecations ? Maybe
>>>> not 20 seconds. Maybe adding an unremovable RED warning in the UI if you
>>>> have any deprecation unhandled. Something that is annoying enough that
>>>> will cause people to make an action but not intrusive enough to break
>>>> things.
>>>>
>>>> I think this is a very basic psychological trait:
>>>>
>>>> If it does not cost the user anything to leave the deprecations, there
>>>> will be a big number of people who will do that.
>>>> We need to make sure that deprecations are costing our users "a little"
>>>> and that they are aware that they can remove the cost (and know how - this
>>>> is super important that we only show such a message with clear instructions
>>>> on what to do).
>>>>
>>>> Such a message might be annoying in many ways:
>>>>
>>>> * it can obstruct significant part of the screen
>>>> * It can be flashy and had very annoying colours (think bold pink fonts
>>>> on flash-green background for example)
>>>> * It can not allow to do anything before the user closes it with a
>>>> REALLY small 'x' that it is difficult to point at (think cookie banners).
>>>> And when they click they can be greeted with "You know - you can actually
>>>> remove this warning by doing THIS".
>>>> * and we have probably more options - our UX specialists might have
>>>> some good ideas here
>>>>
>>>> For anyone who complains about "people do not act on deprecation." -
>>>> yeah this is true.
>>>> But one universal truth there is that if we won't act, this will change
>>>> nothing in the behaviour of our users - this is 100% certain.
>>>>
>>>> If we do something - it might improve it. We will not get a 100%
>>>> success for sure, but we can at least do something to improve it.
>>>>
>>>> WDYT?
>>>>
>>>> J.
>>>>
>>>>
>>>> On Thu, Aug 4, 2022 at 6:52 AM Felix Uellendall <felue...@pm.me.invalid>
>>>> wrote:
>>>>
>>>>> Hi Ping,
>>>>>
>>>>> I had this idea about auto-fixing deprecation warnings too when I
>>>>> started the airflint project.
>>>>> I totally agree with what you said that often people ignore these
>>>>> warnings and therefore a cli tool would come in very handy. It could look
>>>>> similar to the airflint project and also be based on the python refactor
>>>>> package, but I would keep it out of this discussion for now.
>>>>>
>>>>> Regarding the schedule arg, I don’t have a strong opinion there but
>>>>> also tend to having one schedule arg, because of simplicity for the
>>>>> dag author.
>>>>>
>>>>> Best,
>>>>> Felix
>>>>>
>>>>>
>>>>>
>>>>> Sent from Proton Mail for iOS
>>>>>
>>>>>
>>>>> On Thu, Aug 4, 2022 at 00:23, Ping Zhang <pin...@umich.edu> wrote:
>>>>>
>>>>> Hi Daniel,
>>>>>
>>>>> +1 for ‘schedule' and thanks for bringing this up.
>>>>>
>>>>> I agree with Vikram, we should be very careful about deprecating
>>>>> existing params even if we have warnings around it. Not sure if this is a
>>>>> general case, but I notice that people ignore warnings all the time until
>>>>> it starts to break. Thus I am thinking if airflow can have a cli tool to
>>>>> help users to batch update these deprecating params just like python's
>>>>> futurize tool. This can greatly ease the burden of upgrading the airflow
>>>>> version.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Ping
>>>>>
>>>>>
>>>>> On Wed, Aug 3, 2022 at 3:13 PM Daniel Standish
>>>>> <daniel.stand...@astronomer.io.invalid> wrote:
>>>>>
>>>>>> In case you forgot about us @Vikram, gentle nudge
>>>>>>
>>>>>> On Mon, Aug 1, 2022 at 8:40 AM Daniel Standish <
>>>>>> daniel.stand...@astronomer.io> wrote:
>>>>>>
>>>>>>> Well, this is a discussion thread, so let's discuss!
>>>>>>>
>>>>>>> Vikram, what kind of implications are you thinking of? Maybe you
>>>>>>> could provide a little more detail about your concerns?
>>>>>>>
>>>>>>> There are certainly alternatives, and of course each of them has
>>>>>>> tradeoffs.
>>>>>>>
>>>>>>> Here are the options I see:
>>>>>>>
>>>>>>> 1. consolidate to schedule (deprecate schedule_interval and
>>>>>>> timetable; the proposal under discussion in this thread)
>>>>>>> 2. do nothing (keep 3 different scheduling params)
>>>>>>> 3. consolidate to schedule_interval (deprecate timetable)
>>>>>>> 4. don't deprecate timetable, but remove schedule_on (AIP-48 has
>>>>>>> added it in main) and take List[Dataset] in `schedule_interval`
>>>>>>>
>>>>>>> Which would you propose we go with?
>>>>>>>
>>>>>>> If we have a veto against 1 (and again, we're in hypothetical
>>>>>>> territory since this is still a discussion thread), my vote would
>>>>>>> definitely be to go with 3.
>>>>>>>
>>>>>>> The param `schedule_interval` as catch all is not *sooo *bad, but
>>>>>>> `schedule` definitely seems better. From a new user's perspective, 
>>>>>>> having
>>>>>>> misleading names is confusing, and can be a negative when looking at
>>>>>>> adoption. And for a tenured user, it can nag at you. And dealing with
>>>>>>> deprecations as an Airflow user, well it's sort of part of the job, 
>>>>>>> though
>>>>>>> it can be bothersome, certainly.
>>>>>>>
>>>>>>> Just to look at / compare feasibility.
>>>>>>>
>>>>>>> Deprecating schedule_interval, it would be more prominent than some
>>>>>>> other deprecations because it's a DAG param. That said, I think it's 
>>>>>>> pretty
>>>>>>> straightforward to handle. For one, it's at the cardinality of DAG and 
>>>>>>> not
>>>>>>> *task*, so there is less to change with than some other
>>>>>>> deprecations. And being kwargs-only, it's an easy find and replace that
>>>>>>> could be automated with an `upgrade-check` script. It's very easy to
>>>>>>> detect, either to warn about, or ultimately, after upgrade, to fail the 
>>>>>>> DAG
>>>>>>> parse (which forces user to fix). Perhaps we could even experiment with 
>>>>>>> a
>>>>>>> deprecation fix tool, so that users could get help fixing deprecations, 
>>>>>>> not
>>>>>>> necessarily in the context of an upgrade, since we're not looking at 3.0
>>>>>>> any time soon.
>>>>>>>
>>>>>>> We could compare with execution_date, if execution_date is to be
>>>>>>> removed in 3.0. I think the deprecation of execution_date will be a more
>>>>>>> difficult thing for users to handle. For one, potentially higher
>>>>>>> cardinality, since it is used in tasks. But also, the use of 
>>>>>>> execution_date
>>>>>>> can be harder to detect, since it can be in templates and elsewhere in
>>>>>>> custom operators. *(With schedule_interval, since it's a top-level
>>>>>>> dag kwarg, it will be easier.)* And execution_date was replaced
>>>>>>> with `logical_date` for similar reasons -- avoiding user confusion.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Jul 29, 2022 at 2:42 PM Vikram Koka
>>>>>>> <vik...@astronomer.io.invalid> wrote:
>>>>>>>
>>>>>>>> -1 from me.
>>>>>>>>
>>>>>>>> Though I agree in principle with the idea of consolidation, I don't
>>>>>>>> think we should be doing this yet until we understand the implications
>>>>>>>> completely.
>>>>>>>> I am really not in favor of deprecation of the existing params,
>>>>>>>> unless there is really no alternative.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Jul 29, 2022 at 2:37 PM Daniel Standish
>>>>>>>> <daniel.stand...@astronomer.io.invalid> wrote:
>>>>>>>>
>>>>>>>>> So far, seems all in favor.
>>>>>>>>>
>>>>>>>>> I will just highlight, in case it's not clear.... when we release
>>>>>>>>> this (presumably 2.4), basically every single dag in existence will 
>>>>>>>>> start
>>>>>>>>> emitting deprecation warnings, and prior to 3.0, basically every dag 
>>>>>>>>> in
>>>>>>>>> existence will need to be updated.
>>>>>>>>>
>>>>>>>>> Thankfully, for most people, this should be an easy update since
>>>>>>>>> DAG is kwargs only, IIRC.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Jul 29, 2022, 2:27 PM Brent Bovenzi
>>>>>>>>> <br...@astronomer.io.invalid> wrote:
>>>>>>>>>
>>>>>>>>>> +1 to consolidating and "schedule".
>>>>>>>>>>
>>>>>>>>>> On Fri, Jul 29, 2022, 10:12 PM Drew Hubl
>>>>>>>>>> <drew.h...@astronomer.io.invalid> wrote:
>>>>>>>>>>
>>>>>>>>>>> +1 for ‘schedule', and another +1 to the importance of
>>>>>>>>>>> consolidating to one being more important than the name of that one
>>>>>>>>>>>
>>>>>>>>>>> On Jul 29, 2022, at 1:58 PM, Josh Fell <
>>>>>>>>>>> josh.d.f...@astronomer.io.INVALID> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Consolidating to a single scheduling parameter is a big +1 from
>>>>>>>>>>> me. `schedule` seems like a nice catch-all.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jul 29, 2022 at 9:10 AM Jed Cunningham <
>>>>>>>>>>> jedcunning...@apache.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> +1 on moving to a single `schedule` param.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>

Reply via email to