Hey everyone,

I keep getting involved in discussions where people complain about how bad and useless the SLA feature of Airflow is. And yeah, I pretty much agree with them.
Without getting into the details of why it is bad: should we possibly just, well, deprecate it? I think it would send a much stronger signal to our users if they kept getting warnings that the feature is deprecated, and if we officially stated in the docs that they should not rely on it.

I also think we possibly do not have to replace it with an equivalent or better SLA feature. I personally think Airflow on its own should not provide such SLA/monitoring features. Instead, it should become more of a platform that provides useful metrics, enabling other, more dedicated systems to do the job of monitoring and alerting, while the native Airflow UI should be more about "management" than "monitoring".

With the (already approved) OpenTelemetry support https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-49+OpenTelemetry+Support+for+Apache+Airflow integrating such a monitoring solution should become much easier. Possible newer and more sophisticated metrics (such as those proposed by Ping in https://lists.apache.org/thread/g52vk2p7l4nf6on436mbdzwrqstld7jl ) also open this up to more sophisticated usages that Airflow will never be able to match with built-in SLA/monitoring features.

Also, even today there are better ways to achieve SLA functionality. A good and successful story about it was told by Eden from Fyber at the Summit: https://airflowsummit.org/sessions/2022/the-slayer-your-data-pipeline-needs/

Deprecating SLA would signal to users that this is the long-term, recommended approach.

J.