- Resending below to keep this in the same thread as Ping's response. My prior response and Ping's were sent at the same time, and I did not want two email threads. --

I understand the frustration with the SLA feature as it stands. I struggled to understand SLAs early on and eventually realized how they were broken. That said, I believe Airflow users care strongly about the timeliness, and consequently the SLAs, of their data. I also believe SLAs need to be redefined on top of Datasets in AIP-48 (as Ash mentioned earlier).

In summary, my view is that the feature should be redefined and fixed in Airflow rather than removed entirely. I also expect that people will want to build advanced SLA capabilities on top of Airflow, and that's completely reasonable in my book.

On Wed, Jul 13, 2022 at 8:31 PM Ping Zhang <pin...@umich.edu> wrote:

> Hi Jarek,
>
> Thanks for bringing this up.
>
> I agree the SLA feature needs some work. However, I think we want an
> equivalent SLA feature as it is still very useful.
>
> Thanks,
>
> Ping
>
>
> On Tue, Jul 12, 2022 at 12:42 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>
>> Hey everyone,
>>
>> I keep on being involved in discussions where people are complaining
>> about how bad and useless the SLA feature of Airflow is. And yeah, I
>> pretty much agree with it.
>>
>> Without getting into details of why it is bad - should we possibly
>> just, well, deprecate it? I think that would give a much stronger
>> signal to our users if they keep on getting warnings that the feature
>> is deprecated and when we officially deprecate it in the docs that
>> they should not rely on it.
>>
>> I also think that possibly we do not have to replace it with an
>> equivalent/better SLA feature.
>>
>> I personally think Airflow on its own should not provide such
>> SLA/monitoring features, but it should become more of the platform
>> that provides useful metrics that will enable other - more dedicated
>> systems - to do the job of monitoring and alerting - and the native
>> Airflow UI should be more of a "management" than "monitoring".
>>
>> With (already approved) Open Telemetry support
>>
>> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-49+OpenTelemetry+Support+for+Apache+Airflow
>> integrating such monitoring solution should become much easier. Also
>> with possible newer and more sophisticated metrics (such as those
>> proposed by Ping in
>> https://lists.apache.org/thread/g52vk2p7l4nf6on436mbdzwrqstld7jl )
>> this opens up to more sophisticated usages, that Airflow will never be
>> able to match with built-in SLA/monitoring features.
>>
>> Also even today there are better ways to achieve SLA functionality -
>> Good and successful story about it has been told by Eden from Fyber at
>> the Summit:
>> https://airflowsummit.org/sessions/2022/the-slayer-your-data-pipeline-needs/
>>
>> Making SLA deprecate would give a signal to the users that this is the
>> long-term, recommended approach.
>>
>> J.
>>
>