Cool to see that people still need SLA in Airflow. I see the reasons why, and they make perfect sense. Glad we discussed it :)
Still - WDYT: do we want to mark the current SLA as "deprecated" now (even if we do not know what will replace it), at least as a signal to stay away from it? Or do we want to keep the status quo? I think it will be hard to "fix" the SLA as it stands now without breaking a lot of compatibility, so we can deprecate it even now (and we can raise warnings and mention that it will be replaced by something better). I believe, for example, that it was a mistake not to do that early enough with the "experimental" API, which got more people using it and made it harder to make them stop.

J.

On Wed, Jul 13, 2022 at 6:51 PM Vikram Koka <vik...@astronomer.io.invalid> wrote:
>
> - resending below to keep the same thread as Ping's response. My prior
> response and Ping's were sent at the same time, but I did not want two email
> threads --
>
>
> I understand the frustration with the SLA feature as it stands.
> I struggled with trying to understand this early on and finally understood
> how they were broken.
>
> Having said that, I believe that Airflow users strongly care about the
> timeliness and consequently SLAs of their data. I also believe that these
> need to be redefined on top of Datasets in AIP-48 (as Ash mentioned earlier).
>
> In summary, my view is that this needs to be redefined and fixed in Airflow,
> rather than entirely removed from Airflow.
> Having said that, I also believe that people will want to build advanced SLA
> capabilities on top of Airflow, but that's completely reasonable in my book.
>
> On Wed, Jul 13, 2022 at 8:31 PM Ping Zhang <pin...@umich.edu> wrote:
>>
>> Hi Jarek,
>>
>> Thanks for bringing this up.
>>
>> I agree the SLA feature needs some work. However, I think we want an
>> equivalent SLA feature as it is still very useful.
>>
>> Thanks,
>>
>> Ping
>>
>>
>> On Tue, Jul 12, 2022 at 12:42 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>>>
>>> Hey everyone,
>>>
>>> I keep on being involved in discussions where people are complaining
>>> about how bad and useless the SLA feature of Airflow is. And yeah, I
>>> pretty much agree with it.
>>>
>>> Without getting into details of why it is bad - should we possibly
>>> just, well, deprecate it? I think that would give a much stronger
>>> signal to our users if they keep on getting warnings that the feature
>>> is deprecated and when we officially deprecate it in the docs that
>>> they should not rely on it.
>>>
>>> I also think that possibly we do not have to replace it with an
>>> equivalent/better SLA feature.
>>>
>>> I personally think Airflow on its own should not provide such
>>> SLA/monitoring features, but it should become more of the platform
>>> that provides useful metrics that will enable other - more dedicated
>>> systems - to do the job of monitoring and alerting - and the native
>>> Airflow UI should be more of a "management" than "monitoring".
>>>
>>> With (already approved) Open Telemetry support
>>> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-49+OpenTelemetry+Support+for+Apache+Airflow
>>> integrating such monitoring solution should become much easier. Also
>>> with possible newer and more sophisticated metrics (such as those
>>> proposed by Ping in
>>> https://lists.apache.org/thread/g52vk2p7l4nf6on436mbdzwrqstld7jl )
>>> this opens up to more sophisticated usages, that Airflow will never be
>>> able to match with built-in SLA/monitoring features.
>>>
>>> Also even today there are better ways to achieve SLA functionality -
>>> Good and successful story about it has been told by Eden from Fyber at
>>> the Summit:
>>> https://airflowsummit.org/sessions/2022/the-slayer-your-data-pipeline-needs/
>>>
>>> Making SLA deprecated would give a signal to the users that this is the
>>> long-term, recommended approach.
>>>
>>> J.