Hey everyone,

I keep getting pulled into discussions where people complain about
how bad and useless the SLA feature of Airflow is. And yeah, I
pretty much agree.

Without getting into the details of why it is bad: should we possibly
just, well, deprecate it? I think deprecation warnings, together with
an official deprecation notice in the docs, would give our users a
much stronger signal that they should not rely on it.

I also think that possibly we do not have to replace it with an
equivalent/better SLA feature.

I personally think Airflow on its own should not provide such
SLA/monitoring features. It should become more of a platform that
provides useful metrics, enabling other, more dedicated systems to do
the job of monitoring and alerting, and the native Airflow UI should
be more about "management" than "monitoring".

With the (already approved) OpenTelemetry support
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-49+OpenTelemetry+Support+for+Apache+Airflow
integrating such monitoring solutions should become much easier. Also,
with possible newer and more sophisticated metrics (such as those
proposed by Ping in
https://lists.apache.org/thread/g52vk2p7l4nf6on436mbdzwrqstld7jl )
this opens the door to more sophisticated usages that Airflow will
never be able to match with built-in SLA/monitoring features.
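
To illustrate the idea (purely a sketch, not an existing Airflow or
OpenTelemetry API): a dedicated external system could evaluate SLAs
from exported timing data along these lines. The `TaskRun` record and
the `sla_misses` function are hypothetical names made up for this
example, and it assumes the external system already has the scheduled
and finish times available.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class TaskRun:
    # Hypothetical record built by an external system from exported metrics.
    dag_id: str
    task_id: str
    scheduled_at: datetime
    finished_at: Optional[datetime]  # None means the task is still running


def sla_misses(runs: List[TaskRun], sla: timedelta, now: datetime) -> List[TaskRun]:
    """Return the runs that ended, or are still running, past scheduled_at + sla."""
    missed = []
    for run in runs:
        deadline = run.scheduled_at + sla
        # For a run that has not finished yet, compare against the current time.
        effective_end = run.finished_at if run.finished_at is not None else now
        if effective_end > deadline:
            missed.append(run)
    return missed
```

The alerting itself (paging, email, dashboards) would then live in the
monitoring system rather than in Airflow.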

Also, even today there are better ways to achieve SLA functionality.
Eden from Fyber told a good success story about this at the Summit:
https://airflowsummit.org/sessions/2022/the-slayer-your-data-pipeline-needs/

Deprecating SLA would signal to users that relying on such dedicated,
external solutions is the long-term, recommended approach.

J.
