Yep. I think immediate deprecation is the way to go. Those who care will change. For those who don't care - well, they don't care so it doesn't matter.
On Sat, Aug 6, 2022 at 9:10 PM Daniel Standish <daniel.stand...@astronomer.io.invalid> wrote: > It's not addressed explicitly in the vote. Personally I think we should > add a deprecation warning immediately. I understand that there is the > thought to "give users time to make the change" before adding the warning. > But deprecation warnings are probably the most effective, most relied upon > mechanism by which we signal that change is required. I don't really see a > benefit to delaying. And even though 3.0 is not anywhere in sight, some > users will probably end up staying on 2.4 until 3.0 đŸ¤·. Adding the warning > immediately maximizes the time that users have to make the change, and > maximizes the likelihood that users will receive the warning. That said, I > would certainly not veto if that's what the group wants to do. And if it > seems that folks are in favor of this, I can amend the vote thread to add > that provision. > > On Sat, Aug 6, 2022 at 12:01 PM Ash Berlin-Taylor <a...@apache.org> wrote: > >> One question: are we deprecating the existing parameters straight away or >> leaving it without a warning for one release? (I think someone talked about >> that at one point) >> >> On 6 August 2022 19:46:02 BST, Daniel Standish >> <daniel.stand...@astronomer.io.INVALID> wrote: >>> >>> Well, thanks for forcing us to make the case :) >>> >>> I'll initiate a lazy consensus "vote" >>> >>> >>> On Fri, Aug 5, 2022 at 1:01 PM Vikram Koka <vik...@astronomer.io.invalid> >>> wrote: >>> >>>> Fair points all. >>>> I withdraw my objection and agree with the consolidation proposal. >>>> >>>> Vikram >>>> >>>> >>>> On Fri, Aug 5, 2022 at 10:06 AM Jarek Potiuk <ja...@potiuk.com> wrote: >>>> >>>>> Let me also add to why I think we should consolidate to one param:: >>>>> >>>>> * "schedule_interval" >>>>> * "schedule_on" >>>>> * "timetable" >>>>> >>>>> Imagine you are a new user coming to Airflow. >>>>> >>>>> Do you immediately recognise which one is which ? Do you see what the >>>>> difference is? >>>>> >>>>> * "Schedule_on" = "Every Sunday" is equally good as "schedule_on" >>>>> "dataset". >>>>> * "timetable" - what even is a timetable? Is it a "cron TAB" ? Or >>>>> what? >>>>> >>>>> For me even if we wanted to have different names, the current set of >>>>> names for those params is just terribly ambiguous. >>>>> >>>>> Making it "schedule" and acting depending on what you pass to it is so >>>>> much better and simpler. >>>>> >>>>> J. >>>>> >>>>> >>>>> On Fri, Aug 5, 2022 at 6:57 PM Daniel Standish >>>>> <daniel.stand...@astronomer.io.invalid> wrote: >>>>> >>>>>> @Vikram, yeah, I understand that concern. Making a breaking change >>>>>> is something we shouldn't do carelessly, without consideration, without >>>>>> good reason. But I think we have a good reason here. >>>>>> >>>>>> Deprecations are somewhat regular in airflow. In 2.0 we moved a lot >>>>>> of imports. We also changed the CLI semantics. Probably there were >>>>>> other >>>>>> breaking changes that I'm forgetting *(granted sometimes we didn't >>>>>> remove the backward-compatibility immediately in 2.0, and that remains an >>>>>> option here)*. Probably just about every dag in existence had to be >>>>>> changed then too. And that's what a major release is for. >>>>>> >>>>>> In 3.0, it looks like we're deprecating execution_date... That's >>>>>> even bigger. So, were we to do this, (1) moving to `schedule` wouldn't >>>>>> be >>>>>> the only notable breaking change and (2) by comparison it's a pretty >>>>>> simple >>>>>> change. And if, as part of the upgrade to a major release, you have to >>>>>> do >>>>>> renames and such anyway, the marginal cost of one more isn't massive. >>>>>> >>>>>> I guess my take is that we can't be chained to our previous decisions >>>>>> forever. And we have to have the flexibility to adapt our interfaces so >>>>>> that they still make sense and are consistent and don't cause excessive >>>>>> eyebrow wrinkling. We have adopted semantic versioning to provide a >>>>>> mechanism for signaling when breaking changes are coming. And we can add >>>>>> tooling to help users along the way. >>>>>> >>>>>> Additionally, we can't *only *look at the burden of such a change >>>>>> for users. We must also weigh this against the burden that we maintain by >>>>>> *not* changing, i.e. by having too many params, or ill-named params. >>>>>> >>>>>> >>>>>> On Thu, Aug 4, 2022 at 2:57 PM Jed Cunningham < >>>>>> jedcunning...@apache.org> wrote: >>>>>> >>>>>>> I'm not in favor of reusing `schedule_interval` for datasets (or >>>>>>> folding `timetable` into it either), primarily because it doesn't feel >>>>>>> like >>>>>>> a good name to describe what it does. If we don't have the appetite to >>>>>>> consolidate, my vote is we stick with 3 separate params. So preferably >>>>>>> #1, >>>>>>> if not then #2 for me. >>>>>>> >>>>>>> I'm less concerned about the "everyone has to update every DAG" >>>>>>> issue, as it only becomes a problem for Airflow 3, folks can freely >>>>>>> ignore >>>>>>> it until then, and when they do update it it's a simple find/replace. >>>>>>> >>>>>>> >>>>>>> On Thu, Aug 4, 2022 at 2:17 PM Andrew Gibbs < >>>>>>> a.r.gibbs.massma...@gmail.com> wrote: >>>>>>> >>>>>>>> I'll just +1 that IMO a UI reminder is a great option. >>>>>>>> >>>>>>>> 2.3.0 added an alert when robots.txt was being scanned, which was >>>>>>>> great, because I ignored it for a while, but eventually caved and >>>>>>>> looked >>>>>>>> into it. This I would see as more of the same. I currently get loads of >>>>>>>> warnings about things in my config file that have moved; but typically >>>>>>>> only >>>>>>>> if I have to do a `airflow db upgrade` or `airflow dags backfill`. I >>>>>>>> see >>>>>>>> the alert, think "I must fix those", and then the Homer effect kicks >>>>>>>> in, >>>>>>>> and something else takes up the space in my brain. Having them as a >>>>>>>> banner >>>>>>>> in the GUI would 100% get me to sort it sooner or later. >>>>>>>> >>>>>>>> On Thu, Aug 4, 2022 at 9:32 PM Vikram Koka >>>>>>>> <vik...@astronomer.io.invalid> wrote: >>>>>>>> >>>>>>>>> Hi Daniel, >>>>>>>>> >>>>>>>>> I completely agree with wanting to consolidate, but I am very >>>>>>>>> concerned about people needing to change their existing code. >>>>>>>>> As Ping and Felix also said, I worry that people ignore >>>>>>>>> deprecation warnings, especially since the original code authors don't >>>>>>>>> really look at logs or other pipeline artifacts unless something is >>>>>>>>> broken. >>>>>>>>> >>>>>>>>> With respect to the options you outlined, I am personally leaning >>>>>>>>> towards (4) i.e. "don't deprecate timetable, but remove schedule_on >>>>>>>>> (AIP-48 >>>>>>>>> has added it in main) and take List[Dataset] in `schedule_interval`". >>>>>>>>> I >>>>>>>>> could be convinced with (3), since I think that "timetable" is >>>>>>>>> relatively >>>>>>>>> recent and have not seen as much usage for this invocation yet. I am >>>>>>>>> definitely curious about other perspectives on this as well. >>>>>>>>> >>>>>>>>> I have a hard time with option (1) of deprecating "schedule >>>>>>>>> interval", because of the historical nature of it and the code >>>>>>>>> changes that >>>>>>>>> people may have to make to deal with it. >>>>>>>>> >>>>>>>>> Thanks for raising this topic and the discussion, >>>>>>>>> >>>>>>>>> Vikram >>>>>>>>> >>>>>>>>> >>>>>>>>> On Thu, Aug 4, 2022 at 10:24 AM Daniel Standish >>>>>>>>> <daniel.stand...@astronomer.io.invalid> wrote: >>>>>>>>> >>>>>>>>>> I agree with those (including myself) who have suggested that it >>>>>>>>>> would be great to have a deprecation fix tool. >>>>>>>>>> >>>>>>>>>> I also think that UI notifications would be valuable. Another >>>>>>>>>> option would be to provide a less intrusive indicator somewhere >>>>>>>>>> (like when >>>>>>>>>> chrome lets you know you need to update), and then you click on it >>>>>>>>>> and it >>>>>>>>>> takes you to a page that lists out all the deprecation warnings. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Thu, Aug 4, 2022 at 1:03 AM Jarek Potiuk <ja...@potiuk.com> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> I know we are distracting a bit from the main subject, but I >>>>>>>>>>> think it's worth it. >>>>>>>>>>> >>>>>>>>>>> I think this is a fact of life that people don't remove >>>>>>>>>>> deprecations. But we have a choice of "not doing anything about it" >>>>>>>>>>> and >>>>>>>>>>> complaining. Or doing something that might improve it. >>>>>>>>>>> >>>>>>>>>>> I usually err on the side of "doing something" when I see a >>>>>>>>>>> problem. I am a doer :). >>>>>>>>>>> >>>>>>>>>>> And I have a possibly controversial proposal. I got the >>>>>>>>>>> inspiration from node.js old versions. Whenever an old version of >>>>>>>>>>> Node is >>>>>>>>>>> being used, you have a 20 seconds delay artificially added and >>>>>>>>>>> information >>>>>>>>>>> that in order to remove those 20 seconds you need to upgrade. >>>>>>>>>>> >>>>>>>>>>> Why don't we do that for some really old/serious deprecations ? >>>>>>>>>>> Maybe not 20 seconds. Maybe adding an unremovable RED warning in >>>>>>>>>>> the UI if >>>>>>>>>>> you have any deprecation unhandled. Something that is annoying >>>>>>>>>>> enough that >>>>>>>>>>> will cause people to make an action but not intrusive enough to >>>>>>>>>>> break >>>>>>>>>>> things. >>>>>>>>>>> >>>>>>>>>>> I think this is a very basic psychological trait: >>>>>>>>>>> >>>>>>>>>>> If it does not cost the user anything to leave the deprecations, >>>>>>>>>>> there will be a big number of people who will do that. >>>>>>>>>>> We need to make sure that deprecations are costing our users "a >>>>>>>>>>> little" and that they are aware that they can remove the cost (and >>>>>>>>>>> know how >>>>>>>>>>> - this is super important that we only show such a message with >>>>>>>>>>> clear >>>>>>>>>>> instructions on what to do). >>>>>>>>>>> >>>>>>>>>>> Such a message might be annoying in many ways: >>>>>>>>>>> >>>>>>>>>>> * it can obstruct significant part of the screen >>>>>>>>>>> * It can be flashy and had very annoying colours (think bold >>>>>>>>>>> pink fonts on flash-green background for example) >>>>>>>>>>> * It can not allow to do anything before the user closes it with >>>>>>>>>>> a REALLY small 'x' that it is difficult to point at (think cookie >>>>>>>>>>> banners). >>>>>>>>>>> And when they click they can be greeted with "You know - you can >>>>>>>>>>> actually >>>>>>>>>>> remove this warning by doing THIS". >>>>>>>>>>> * and we have probably more options - our UX specialists might >>>>>>>>>>> have some good ideas here >>>>>>>>>>> >>>>>>>>>>> For anyone who complains about "people do not act on >>>>>>>>>>> deprecation." - yeah this is true. >>>>>>>>>>> But one universal truth there is that if we won't act, this will >>>>>>>>>>> change nothing in the behaviour of our users - this is 100% certain. >>>>>>>>>>> >>>>>>>>>>> If we do something - it might improve it. We will not get a 100% >>>>>>>>>>> success for sure, but we can at least do something to improve it. >>>>>>>>>>> >>>>>>>>>>> WDYT? >>>>>>>>>>> >>>>>>>>>>> J. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Thu, Aug 4, 2022 at 6:52 AM Felix Uellendall >>>>>>>>>>> <felue...@pm.me.invalid> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi Ping, >>>>>>>>>>>> >>>>>>>>>>>> I had this idea about auto-fixing deprecation warnings too when >>>>>>>>>>>> I started the airflint project. >>>>>>>>>>>> I totally agree with what you said that often people ignore >>>>>>>>>>>> these warnings and therefore a cli tool would come in very handy. >>>>>>>>>>>> It could >>>>>>>>>>>> look similar to the airflint project and also be based on the >>>>>>>>>>>> python >>>>>>>>>>>> refactor package, but I would keep it out of this discussion for >>>>>>>>>>>> now. >>>>>>>>>>>> >>>>>>>>>>>> Regarding the schedule arg, I don’t have a strong opinion there >>>>>>>>>>>> but also tend to having one schedule arg, because of >>>>>>>>>>>> simplicity for the dag author. >>>>>>>>>>>> >>>>>>>>>>>> Best, >>>>>>>>>>>> Felix >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Sent from Proton Mail for iOS >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Aug 4, 2022 at 00:23, Ping Zhang <pin...@umich.edu> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Hi Daniel, >>>>>>>>>>>> >>>>>>>>>>>> +1 for ‘schedule' and thanks for bringing this up. >>>>>>>>>>>> >>>>>>>>>>>> I agree with Vikram, we should be very careful about >>>>>>>>>>>> deprecating existing params even if we have warnings around it. >>>>>>>>>>>> Not sure if >>>>>>>>>>>> this is a general case, but I notice that people ignore warnings >>>>>>>>>>>> all the >>>>>>>>>>>> time until it starts to break. Thus I am thinking if airflow can >>>>>>>>>>>> have a cli >>>>>>>>>>>> tool to help users to batch update these deprecating params just >>>>>>>>>>>> like >>>>>>>>>>>> python's futurize tool. This can greatly ease the burden of >>>>>>>>>>>> upgrading the >>>>>>>>>>>> airflow version. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Ping >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Wed, Aug 3, 2022 at 3:13 PM Daniel Standish >>>>>>>>>>>> <daniel.stand...@astronomer.io.invalid> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> In case you forgot about us @Vikram, gentle nudge >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, Aug 1, 2022 at 8:40 AM Daniel Standish < >>>>>>>>>>>>> daniel.stand...@astronomer.io> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Well, this is a discussion thread, so let's discuss! >>>>>>>>>>>>>> >>>>>>>>>>>>>> Vikram, what kind of implications are you thinking of? Maybe >>>>>>>>>>>>>> you could provide a little more detail about your concerns? >>>>>>>>>>>>>> >>>>>>>>>>>>>> There are certainly alternatives, and of course each of them >>>>>>>>>>>>>> has tradeoffs. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Here are the options I see: >>>>>>>>>>>>>> >>>>>>>>>>>>>> 1. consolidate to schedule (deprecate schedule_interval and >>>>>>>>>>>>>> timetable; the proposal under discussion in this thread) >>>>>>>>>>>>>> 2. do nothing (keep 3 different scheduling params) >>>>>>>>>>>>>> 3. consolidate to schedule_interval (deprecate timetable) >>>>>>>>>>>>>> 4. don't deprecate timetable, but remove schedule_on (AIP-48 >>>>>>>>>>>>>> has added it in main) and take List[Dataset] in >>>>>>>>>>>>>> `schedule_interval` >>>>>>>>>>>>>> >>>>>>>>>>>>>> Which would you propose we go with? >>>>>>>>>>>>>> >>>>>>>>>>>>>> If we have a veto against 1 (and again, we're in hypothetical >>>>>>>>>>>>>> territory since this is still a discussion thread), my vote would >>>>>>>>>>>>>> definitely be to go with 3. >>>>>>>>>>>>>> >>>>>>>>>>>>>> The param `schedule_interval` as catch all is not *sooo *bad, >>>>>>>>>>>>>> but `schedule` definitely seems better. From a new user's >>>>>>>>>>>>>> perspective, >>>>>>>>>>>>>> having misleading names is confusing, and can be a negative when >>>>>>>>>>>>>> looking at >>>>>>>>>>>>>> adoption. And for a tenured user, it can nag at you. And dealing >>>>>>>>>>>>>> with >>>>>>>>>>>>>> deprecations as an Airflow user, well it's sort of part of the >>>>>>>>>>>>>> job, though >>>>>>>>>>>>>> it can be bothersome, certainly. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Just to look at / compare feasibility. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Deprecating schedule_interval, it would be more prominent >>>>>>>>>>>>>> than some other deprecations because it's a DAG param. That >>>>>>>>>>>>>> said, I think >>>>>>>>>>>>>> it's pretty straightforward to handle. For one, it's at the >>>>>>>>>>>>>> cardinality of >>>>>>>>>>>>>> DAG and not *task*, so there is less to change with than >>>>>>>>>>>>>> some other deprecations. And being kwargs-only, it's an easy >>>>>>>>>>>>>> find and >>>>>>>>>>>>>> replace that could be automated with an `upgrade-check` script. >>>>>>>>>>>>>> It's very >>>>>>>>>>>>>> easy to detect, either to warn about, or ultimately, after >>>>>>>>>>>>>> upgrade, to fail >>>>>>>>>>>>>> the DAG parse (which forces user to fix). Perhaps we could even >>>>>>>>>>>>>> experiment >>>>>>>>>>>>>> with a deprecation fix tool, so that users could get help fixing >>>>>>>>>>>>>> deprecations, not necessarily in the context of an upgrade, >>>>>>>>>>>>>> since we're not >>>>>>>>>>>>>> looking at 3.0 any time soon. >>>>>>>>>>>>>> >>>>>>>>>>>>>> We could compare with execution_date, if execution_date is to >>>>>>>>>>>>>> be removed in 3.0. I think the deprecation of execution_date >>>>>>>>>>>>>> will be a more >>>>>>>>>>>>>> difficult thing for users to handle. For one, potentially higher >>>>>>>>>>>>>> cardinality, since it is used in tasks. But also, the use of >>>>>>>>>>>>>> execution_date >>>>>>>>>>>>>> can be harder to detect, since it can be in templates and >>>>>>>>>>>>>> elsewhere in >>>>>>>>>>>>>> custom operators. *(With schedule_interval, since it's a >>>>>>>>>>>>>> top-level dag kwarg, it will be easier.)* And execution_date >>>>>>>>>>>>>> was replaced with `logical_date` for similar reasons -- avoiding >>>>>>>>>>>>>> user >>>>>>>>>>>>>> confusion. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, Jul 29, 2022 at 2:42 PM Vikram Koka >>>>>>>>>>>>>> <vik...@astronomer.io.invalid> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> -1 from me. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Though I agree in principle with the idea of consolidation, >>>>>>>>>>>>>>> I don't think we should be doing this yet until we understand >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>> implications completely. >>>>>>>>>>>>>>> I am really not in favor of deprecation of the existing >>>>>>>>>>>>>>> params, unless there is really no alternative. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Fri, Jul 29, 2022 at 2:37 PM Daniel Standish >>>>>>>>>>>>>>> <daniel.stand...@astronomer.io.invalid> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> So far, seems all in favor. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I will just highlight, in case it's not clear.... when we >>>>>>>>>>>>>>>> release this (presumably 2.4), basically every single dag in >>>>>>>>>>>>>>>> existence will >>>>>>>>>>>>>>>> start emitting deprecation warnings, and prior to 3.0, >>>>>>>>>>>>>>>> basically every dag >>>>>>>>>>>>>>>> in existence will need to be updated. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thankfully, for most people, this should be an easy update >>>>>>>>>>>>>>>> since DAG is kwargs only, IIRC. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Fri, Jul 29, 2022, 2:27 PM Brent Bovenzi >>>>>>>>>>>>>>>> <br...@astronomer.io.invalid> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> +1 to consolidating and "schedule". >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Fri, Jul 29, 2022, 10:12 PM Drew Hubl >>>>>>>>>>>>>>>>> <drew.h...@astronomer.io.invalid> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> +1 for ‘schedule', and another +1 to the importance of >>>>>>>>>>>>>>>>>> consolidating to one being more important than the name of >>>>>>>>>>>>>>>>>> that one >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Jul 29, 2022, at 1:58 PM, Josh Fell < >>>>>>>>>>>>>>>>>> josh.d.f...@astronomer.io.INVALID> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Consolidating to a single scheduling parameter is a big >>>>>>>>>>>>>>>>>> +1 from me. `schedule` seems like a nice catch-all. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Fri, Jul 29, 2022 at 9:10 AM Jed Cunningham < >>>>>>>>>>>>>>>>>> jedcunning...@apache.org> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> +1 on moving to a single `schedule` param. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>