One question: are we deprecating the existing parameters straight away or 
leaving it without a warning for one release? (I think someone talked about 
that at one point)

On 6 August 2022 19:46:02 BST, Daniel Standish 
<daniel.stand...@astronomer.io.INVALID> wrote:
>Well, thanks for forcing us to make the case :)
>
>I'll initiate a lazy consensus "vote"
>
>
>On Fri, Aug 5, 2022 at 1:01 PM Vikram Koka <vik...@astronomer.io.invalid>
>wrote:
>
>> Fair points all.
>> I withdraw my objection and agree with the consolidation proposal.
>>
>> Vikram
>>
>>
>> On Fri, Aug 5, 2022 at 10:06 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>>
>>> Let me also add to why I think we should consolidate to one param::
>>>
>>> * "schedule_interval"
>>> * "schedule_on"
>>> * "timetable"
>>>
>>> Imagine you are a new user coming to Airflow.
>>>
>>> Do you immediately recognise which one is which ? Do you see what the
>>> difference is?
>>>
>>> * "Schedule_on" = "Every Sunday" is equally good as "schedule_on"
>>> "dataset".
>>> * "timetable" - what even is a timetable? Is it a "cron TAB" ? Or what?
>>>
>>> For me even if we wanted to have different names, the current set of
>>> names for those params is just terribly ambiguous.
>>>
>>> Making it "schedule" and acting depending on what you pass to it is so
>>> much better and simpler.
>>>
>>> J.
>>>
>>>
>>> On Fri, Aug 5, 2022 at 6:57 PM Daniel Standish
>>> <daniel.stand...@astronomer.io.invalid> wrote:
>>>
>>>> @Vikram, yeah, I understand that concern.  Making a breaking change is
>>>> something we shouldn't do carelessly, without consideration, without good
>>>> reason.  But I think we have a good reason here.
>>>>
>>>> Deprecations are somewhat regular in airflow.  In 2.0 we moved a lot of
>>>> imports.  We also changed the CLI semantics.  Probably there were other
>>>> breaking changes that I'm forgetting *(granted sometimes we didn't
>>>> remove the backward-compatibility immediately in 2.0, and that remains an
>>>> option here)*.  Probably just about every dag in existence had to be
>>>> changed then too.  And that's what a major release is for.
>>>>
>>>> In 3.0, it looks like we're deprecating execution_date...  That's even
>>>> bigger.  So, were we to do this, (1) moving to `schedule` wouldn't be the
>>>> only notable breaking change and (2) by comparison it's a pretty simple
>>>> change.  And if, as part of the upgrade to a major release, you have to do
>>>> renames and such anyway, the marginal cost of one more isn't massive.
>>>>
>>>> I guess my take is that we can't be chained to our previous decisions
>>>> forever.  And we have to have the flexibility to adapt our interfaces so
>>>> that they still make sense and are consistent and don't cause excessive
>>>> eyebrow wrinkling.  We have adopted semantic versioning to provide a
>>>> mechanism for signaling when breaking changes are coming.  And we can add
>>>> tooling to help users along the way.
>>>>
>>>> Additionally, we can't *only *look at the burden of such a change for
>>>> users. We must also weigh this against the burden that we maintain by
>>>> *not* changing, i.e. by having too many params, or ill-named params.
>>>>
>>>>
>>>> On Thu, Aug 4, 2022 at 2:57 PM Jed Cunningham <jedcunning...@apache.org>
>>>> wrote:
>>>>
>>>>> I'm not in favor of reusing `schedule_interval` for datasets (or
>>>>> folding `timetable` into it either), primarily because it doesn't feel 
>>>>> like
>>>>> a good name to describe what it does. If we don't have the appetite to
>>>>> consolidate, my vote is we stick with 3 separate params. So preferably #1,
>>>>> if not then #2 for me.
>>>>>
>>>>> I'm less concerned about the "everyone has to update every DAG" issue,
>>>>> as it only becomes a problem for Airflow 3, folks can freely ignore it
>>>>> until then, and when they do update it it's a simple find/replace.
>>>>>
>>>>>
>>>>> On Thu, Aug 4, 2022 at 2:17 PM Andrew Gibbs <
>>>>> a.r.gibbs.massma...@gmail.com> wrote:
>>>>>
>>>>>> I'll just +1 that IMO a UI reminder is a great option.
>>>>>>
>>>>>> 2.3.0 added an alert when robots.txt was being scanned, which was
>>>>>> great, because I ignored it for a while, but eventually caved and looked
>>>>>> into it. This I would see as more of the same. I currently get loads of
>>>>>> warnings about things in my config file that have moved; but typically 
>>>>>> only
>>>>>> if I have to do a `airflow db upgrade` or `airflow dags backfill`. I see
>>>>>> the alert, think "I must fix those", and then the Homer effect kicks in,
>>>>>> and something else takes up the space in my brain. Having them as a 
>>>>>> banner
>>>>>> in the GUI would 100% get me to sort it sooner or later.
>>>>>>
>>>>>> On Thu, Aug 4, 2022 at 9:32 PM Vikram Koka
>>>>>> <vik...@astronomer.io.invalid> wrote:
>>>>>>
>>>>>>> Hi Daniel,
>>>>>>>
>>>>>>> I completely agree with wanting to consolidate, but I am very
>>>>>>> concerned about people needing to change their existing code.
>>>>>>> As Ping and Felix also said, I worry that people ignore deprecation
>>>>>>> warnings, especially since the original code authors don't really look 
>>>>>>> at
>>>>>>> logs or other pipeline artifacts unless something is broken.
>>>>>>>
>>>>>>> With respect to the options you outlined, I am personally leaning
>>>>>>> towards (4) i.e. "don't deprecate timetable, but remove schedule_on 
>>>>>>> (AIP-48
>>>>>>> has added it in main) and take List[Dataset] in `schedule_interval`". I
>>>>>>> could be convinced with (3), since I think that "timetable" is 
>>>>>>> relatively
>>>>>>> recent and have not seen as much usage for this invocation yet. I am
>>>>>>> definitely curious about other perspectives on this as well.
>>>>>>>
>>>>>>> I have a hard time with option (1) of deprecating "schedule
>>>>>>> interval", because of the historical nature of it and the code changes 
>>>>>>> that
>>>>>>> people may have to make to deal with it.
>>>>>>>
>>>>>>> Thanks for raising this topic and the discussion,
>>>>>>>
>>>>>>> Vikram
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Aug 4, 2022 at 10:24 AM Daniel Standish
>>>>>>> <daniel.stand...@astronomer.io.invalid> wrote:
>>>>>>>
>>>>>>>> I agree with those (including myself) who have suggested that it
>>>>>>>> would be great to have a deprecation fix tool.
>>>>>>>>
>>>>>>>> I also think that UI notifications would be valuable.  Another
>>>>>>>> option would be to provide a less intrusive indicator somewhere (like 
>>>>>>>> when
>>>>>>>> chrome lets you know you need to update), and then you click on it and 
>>>>>>>> it
>>>>>>>> takes you to a page that lists out all the deprecation warnings.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Aug 4, 2022 at 1:03 AM Jarek Potiuk <ja...@potiuk.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I know we are distracting a bit from the main subject, but I think
>>>>>>>>> it's worth it.
>>>>>>>>>
>>>>>>>>> I think this is a fact of life that people don't remove
>>>>>>>>> deprecations. But we have a choice of "not doing anything about it" 
>>>>>>>>> and
>>>>>>>>> complaining. Or doing something that might improve it.
>>>>>>>>>
>>>>>>>>> I usually err on the side of "doing something" when I see a
>>>>>>>>> problem. I am a doer :).
>>>>>>>>>
>>>>>>>>> And I have a possibly controversial proposal. I got the inspiration
>>>>>>>>> from node.js old versions. Whenever an old version of Node is being 
>>>>>>>>> used,
>>>>>>>>> you have a 20 seconds delay artificially added and information that in
>>>>>>>>> order to remove those 20 seconds you need to upgrade.
>>>>>>>>>
>>>>>>>>> Why don't we do that for some really old/serious deprecations ?
>>>>>>>>> Maybe not 20 seconds. Maybe adding an unremovable RED warning in the 
>>>>>>>>> UI if
>>>>>>>>> you have any deprecation unhandled. Something that is annoying enough 
>>>>>>>>> that
>>>>>>>>> will cause people to make an action but not intrusive enough to break
>>>>>>>>> things.
>>>>>>>>>
>>>>>>>>> I think this is a very basic psychological trait:
>>>>>>>>>
>>>>>>>>> If it does not cost the user anything to leave the deprecations,
>>>>>>>>> there will be a big number of people who will do that.
>>>>>>>>> We need to make sure that deprecations are costing our users "a
>>>>>>>>> little" and that they are aware that they can remove the cost (and 
>>>>>>>>> know how
>>>>>>>>> - this is super important that we only show such a message with clear
>>>>>>>>> instructions on what to do).
>>>>>>>>>
>>>>>>>>> Such a message might be annoying in many ways:
>>>>>>>>>
>>>>>>>>> * it can obstruct significant part of the screen
>>>>>>>>> * It can be flashy and had very annoying colours (think bold pink
>>>>>>>>> fonts on flash-green background for example)
>>>>>>>>> * It can not allow to do anything before the user closes it with a
>>>>>>>>> REALLY small 'x' that it is difficult to point at (think cookie 
>>>>>>>>> banners).
>>>>>>>>> And when they click they can be greeted with "You know - you can 
>>>>>>>>> actually
>>>>>>>>> remove this warning by doing THIS".
>>>>>>>>> * and we have probably more options - our UX specialists might have
>>>>>>>>> some good ideas here
>>>>>>>>>
>>>>>>>>> For anyone who complains about "people do not act on deprecation."
>>>>>>>>> - yeah this is true.
>>>>>>>>> But one universal truth there is that if we won't act, this will
>>>>>>>>> change nothing in the behaviour of our users - this is 100% certain.
>>>>>>>>>
>>>>>>>>> If we do something - it might improve it. We will not get a 100%
>>>>>>>>> success for sure, but we can at least do something to improve it.
>>>>>>>>>
>>>>>>>>> WDYT?
>>>>>>>>>
>>>>>>>>> J.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Aug 4, 2022 at 6:52 AM Felix Uellendall
>>>>>>>>> <felue...@pm.me.invalid> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Ping,
>>>>>>>>>>
>>>>>>>>>> I had this idea about auto-fixing deprecation warnings too when I
>>>>>>>>>> started the airflint project.
>>>>>>>>>> I totally agree with what you said that often people ignore these
>>>>>>>>>> warnings and therefore a cli tool would come in very handy. It could 
>>>>>>>>>> look
>>>>>>>>>> similar to the airflint project and also be based on the python 
>>>>>>>>>> refactor
>>>>>>>>>> package, but I would keep it out of this discussion for now.
>>>>>>>>>>
>>>>>>>>>> Regarding the schedule arg, I don’t have a strong opinion there
>>>>>>>>>> but also tend to having one schedule arg, because of simplicity
>>>>>>>>>> for the dag author.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Felix
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Sent from Proton Mail for iOS
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Aug 4, 2022 at 00:23, Ping Zhang <pin...@umich.edu> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Daniel,
>>>>>>>>>>
>>>>>>>>>> +1 for ‘schedule' and thanks for bringing this up.
>>>>>>>>>>
>>>>>>>>>> I agree with Vikram, we should be very careful about deprecating
>>>>>>>>>> existing params even if we have warnings around it. Not sure if this 
>>>>>>>>>> is a
>>>>>>>>>> general case, but I notice that people ignore warnings all the time 
>>>>>>>>>> until
>>>>>>>>>> it starts to break. Thus I am thinking if airflow can have a cli 
>>>>>>>>>> tool to
>>>>>>>>>> help users to batch update these deprecating params just like 
>>>>>>>>>> python's
>>>>>>>>>> futurize tool. This can greatly ease the burden of upgrading the 
>>>>>>>>>> airflow
>>>>>>>>>> version.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Ping
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Aug 3, 2022 at 3:13 PM Daniel Standish
>>>>>>>>>> <daniel.stand...@astronomer.io.invalid> wrote:
>>>>>>>>>>
>>>>>>>>>>> In case you forgot about us @Vikram, gentle nudge
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Aug 1, 2022 at 8:40 AM Daniel Standish <
>>>>>>>>>>> daniel.stand...@astronomer.io> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Well, this is a discussion thread, so let's discuss!
>>>>>>>>>>>>
>>>>>>>>>>>> Vikram, what kind of implications are you thinking of? Maybe you
>>>>>>>>>>>> could provide a little more detail about your concerns?
>>>>>>>>>>>>
>>>>>>>>>>>> There are certainly alternatives, and of course each of them has
>>>>>>>>>>>> tradeoffs.
>>>>>>>>>>>>
>>>>>>>>>>>> Here are the options I see:
>>>>>>>>>>>>
>>>>>>>>>>>> 1. consolidate to schedule (deprecate schedule_interval and
>>>>>>>>>>>> timetable; the proposal under discussion in this thread)
>>>>>>>>>>>> 2. do nothing (keep 3 different scheduling params)
>>>>>>>>>>>> 3. consolidate to schedule_interval (deprecate timetable)
>>>>>>>>>>>> 4. don't deprecate timetable, but remove schedule_on (AIP-48 has
>>>>>>>>>>>> added it in main) and take List[Dataset] in `schedule_interval`
>>>>>>>>>>>>
>>>>>>>>>>>> Which would you propose we go with?
>>>>>>>>>>>>
>>>>>>>>>>>> If we have a veto against 1 (and again, we're in hypothetical
>>>>>>>>>>>> territory since this is still a discussion thread), my vote would
>>>>>>>>>>>> definitely be to go with 3.
>>>>>>>>>>>>
>>>>>>>>>>>> The param `schedule_interval` as catch all is not *sooo *bad,
>>>>>>>>>>>> but `schedule` definitely seems better. From a new user's 
>>>>>>>>>>>> perspective,
>>>>>>>>>>>> having misleading names is confusing, and can be a negative when 
>>>>>>>>>>>> looking at
>>>>>>>>>>>> adoption. And for a tenured user, it can nag at you. And dealing 
>>>>>>>>>>>> with
>>>>>>>>>>>> deprecations as an Airflow user, well it's sort of part of the 
>>>>>>>>>>>> job, though
>>>>>>>>>>>> it can be bothersome, certainly.
>>>>>>>>>>>>
>>>>>>>>>>>> Just to look at / compare feasibility.
>>>>>>>>>>>>
>>>>>>>>>>>> Deprecating schedule_interval, it would be more prominent than
>>>>>>>>>>>> some other deprecations because it's a DAG param. That said, I 
>>>>>>>>>>>> think it's
>>>>>>>>>>>> pretty straightforward to handle. For one, it's at the cardinality 
>>>>>>>>>>>> of DAG
>>>>>>>>>>>> and not *task*, so there is less to change with than some other
>>>>>>>>>>>> deprecations. And being kwargs-only, it's an easy find and replace 
>>>>>>>>>>>> that
>>>>>>>>>>>> could be automated with an `upgrade-check` script. It's very easy 
>>>>>>>>>>>> to
>>>>>>>>>>>> detect, either to warn about, or ultimately, after upgrade, to 
>>>>>>>>>>>> fail the DAG
>>>>>>>>>>>> parse (which forces user to fix). Perhaps we could even experiment 
>>>>>>>>>>>> with a
>>>>>>>>>>>> deprecation fix tool, so that users could get help fixing 
>>>>>>>>>>>> deprecations, not
>>>>>>>>>>>> necessarily in the context of an upgrade, since we're not looking 
>>>>>>>>>>>> at 3.0
>>>>>>>>>>>> any time soon.
>>>>>>>>>>>>
>>>>>>>>>>>> We could compare with execution_date, if execution_date is to be
>>>>>>>>>>>> removed in 3.0. I think the deprecation of execution_date will be 
>>>>>>>>>>>> a more
>>>>>>>>>>>> difficult thing for users to handle. For one, potentially higher
>>>>>>>>>>>> cardinality, since it is used in tasks. But also, the use of 
>>>>>>>>>>>> execution_date
>>>>>>>>>>>> can be harder to detect, since it can be in templates and 
>>>>>>>>>>>> elsewhere in
>>>>>>>>>>>> custom operators. *(With schedule_interval, since it's a
>>>>>>>>>>>> top-level dag kwarg, it will be easier.)* And execution_date
>>>>>>>>>>>> was replaced with `logical_date` for similar reasons -- avoiding 
>>>>>>>>>>>> user
>>>>>>>>>>>> confusion.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jul 29, 2022 at 2:42 PM Vikram Koka
>>>>>>>>>>>> <vik...@astronomer.io.invalid> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> -1 from me.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Though I agree in principle with the idea of consolidation, I
>>>>>>>>>>>>> don't think we should be doing this yet until we understand the
>>>>>>>>>>>>> implications completely.
>>>>>>>>>>>>> I am really not in favor of deprecation of the existing params,
>>>>>>>>>>>>> unless there is really no alternative.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Jul 29, 2022 at 2:37 PM Daniel Standish
>>>>>>>>>>>>> <daniel.stand...@astronomer.io.invalid> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> So far, seems all in favor.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I will just highlight, in case it's not clear.... when we
>>>>>>>>>>>>>> release this (presumably 2.4), basically every single dag in 
>>>>>>>>>>>>>> existence will
>>>>>>>>>>>>>> start emitting deprecation warnings, and prior to 3.0, basically 
>>>>>>>>>>>>>> every dag
>>>>>>>>>>>>>> in existence will need to be updated.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thankfully, for most people, this should be an easy update
>>>>>>>>>>>>>> since DAG is kwargs only, IIRC.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Jul 29, 2022, 2:27 PM Brent Bovenzi
>>>>>>>>>>>>>> <br...@astronomer.io.invalid> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> +1 to consolidating and "schedule".
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Jul 29, 2022, 10:12 PM Drew Hubl
>>>>>>>>>>>>>>> <drew.h...@astronomer.io.invalid> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> +1 for ‘schedule', and another +1 to the importance of
>>>>>>>>>>>>>>>> consolidating to one being more important than the name of 
>>>>>>>>>>>>>>>> that one
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Jul 29, 2022, at 1:58 PM, Josh Fell <
>>>>>>>>>>>>>>>> josh.d.f...@astronomer.io.INVALID> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Consolidating to a single scheduling parameter is a big +1
>>>>>>>>>>>>>>>> from me. `schedule` seems like a nice catch-all.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Jul 29, 2022 at 9:10 AM Jed Cunningham <
>>>>>>>>>>>>>>>> jedcunning...@apache.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> +1 on moving to a single `schedule` param.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>

Reply via email to