Thanks for bringing this up. I agree SLAs have been broken forever and try to 
stay away from them.

However, I do see people trying to use them (not knowing they're broken). While I 
appreciate the effort others have made to build a system around Airflow for 
correct SLA alerting, I think Airflow is the right place to check for SLAs — 
Airflow starts and manages other processes after all, which to me feels like 
the place for SLA decisions to be made.

I have some implementation ideas but will save those for later; I'm curious to 
hear thoughts first.

Bas

> On 12 Jul 2022, at 09:42, Jarek Potiuk <ja...@potiuk.com> wrote:
> 
> Hey everyone,
> 
> I keep on being involved in discussions where people are complaining
> about how bad and useless the SLA feature of Airflow is. And yeah, I
> pretty much agree with it.
> 
> Without getting into details of why it is bad - should we possibly
> just, well, deprecate it? I think that would give a much stronger
> signal to our users if they keep on getting warnings that the feature
> is deprecated and when we officially deprecate it in the docs that
> they should not rely on it.
> 
> I also think that possibly we do not have to replace it with an
> equivalent/better SLA feature.
> 
> I personally think Airflow on its own should not provide such
> SLA/monitoring features, but it should become more of the platform
> that provides useful metrics that will enable other - more dedicated
> systems - to do the job of monitoring and alerting - and the native
> Airflow UI should be more of a "management" than "monitoring".
> 
> With (already approved) Open Telemetry support
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-49+OpenTelemetry+Support+for+Apache+Airflow
> integrating such monitoring solutions should become much easier. Also
> with possible newer and more sophisticated metrics (such as those
> proposed by Ping in
> https://lists.apache.org/thread/g52vk2p7l4nf6on436mbdzwrqstld7jl )
> this opens up to more sophisticated usages, that Airflow will never be
> able to match with built-in SLA/monitoring features.
> 
> Also, even today there are better ways to achieve SLA functionality.
> A good success story about it was told by Eden from Fyber at
> the Summit:
> https://airflowsummit.org/sessions/2022/the-slayer-your-data-pipeline-needs/
> 
> Deprecating SLA would give a signal to the users that this is the
> long-term, recommended approach.
> 
> J.
