Re: [DISCUSS] Airflow Scheduling Delay Metric Definition

2022-07-26 Thread Jarek Potiuk
CAN modify Airflow code and add missing features and functionality to capture the necessary metric data in the code, rather than using triggers. We could even define some kind of callback

Re: [DISCUSS] Airflow Scheduling Delay Metric Definition

2022-07-26 Thread Ping Zhang
for, once we have Open-Telemetry integrated we could add more and more such useful metrics more easily, and that could be way more useful, because instead of running external custom-db-reading process for that, we

Re: [DISCUSS] Airflow Scheduling Delay Metric Definition

2022-07-26 Thread Jarek Potiuk
On Thu, Jun 30, 2022 at 4:52 PM Vikram Koka wrote: Hi Ping, Apologies for the be

Re: [DISCUSS] Airflow Scheduling Delay Metric Definition

2022-07-25 Thread Ping Zhang
We don't use the tasks that don't have any upstream tasks in this metric for measuring task lag. And for tasks that have multiple upstream tasks, we use the upstream task
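
A minimal sketch of the task lag rule quoted above, in plain Python with no Airflow imports. It skips tasks that have no upstream tasks, as the message says. The message is cut off before it states which upstream task is used when there are several; the sketch assumes the latest-finishing one, which is an assumption and not confirmed by this snippet. The function name `task_lag` and its three dictionary arguments are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def task_lag(
    start_times: Dict[str, datetime],
    end_times: Dict[str, datetime],
    upstreams: Dict[str, List[str]],
) -> Dict[str, timedelta]:
    """Return, per task, the gap between 'dependencies met' and task start.

    Tasks with no upstream tasks are excluded from the metric.
    ASSUMPTION: for several upstream tasks, 'dependencies met' is taken
    as the end time of the upstream task that finished last.
    """
    lags: Dict[str, timedelta] = {}
    for task_id, parents in upstreams.items():
        if not parents:
            # Root task: no upstream, so it is not part of the metric.
            continue
        deps_met_at = max(end_times[parent] for parent in parents)
        lags[task_id] = start_times[task_id] - deps_met_at
    return lags


# Hypothetical three-task DAG: "c" depends on "a" and "b".
t0 = datetime(2022, 7, 25, 12, 0, 0)
example_starts = {"a": t0, "b": t0, "c": t0 + timedelta(seconds=70)}
example_ends = {
    "a": t0 + timedelta(seconds=30),
    "b": t0 + timedelta(seconds=60),
    "c": t0 + timedelta(seconds=90),
}
example_upstreams = {"a": [], "b": [], "c": ["a", "b"]}
example_lags = task_lag(example_starts, example_ends, example_upstreams)
```

In this example only "c" gets a lag value: it starts 70 seconds after `t0`, and its last upstream task ends 60 seconds after `t0`.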

Re: [DISCUSS] Airflow Scheduling Delay Metric Definition

2022-07-13 Thread Ping Zhang
On Wed, Jun 8, 2022 at 2:58 PM Ping Zhang wrote: Hi Mehta, Good point. The primary goal of the metric is for stress testing to catch airflow scheduler performance regression for 1) our in

Re: [DISCUSS] Airflow Scheduling Delay Metric Definition

2022-07-12 Thread Jarek Potiuk
n of all parent tasks before scheduling any downstream task. Vikram On Wed, Jun 8, 2022 at 2:58 PM Ping Zhang wrote: Hi Mehta,

Re: [DISCUSS] Airflow Scheduling Delay Metric Definition

2022-07-12 Thread Jarek Potiuk
airflow scheduler performance regression for 1) our internal scheduler improvement work and 2) airflow version upgrade. One of the key benefits of this metric definition is it is independent from the scheduler

Re: [DISCUSS] Airflow Scheduling Delay Metric Definition

2022-07-12 Thread Ping Zhang
d, Jun 8, 2022 at 2:36 PM Mehta, Shubham wrote: Ping, I’m very interested in this as well. A good metric can help us benchmark and identify potential improvements in the scheduler perfo

Re: [DISCUSS] Airflow Scheduling Delay Metric Definition

2022-07-11 Thread Jarek Potiuk
formance. In order to understand the proposal better, can you please share where and how do you intend to use “Scheduling delay”? Is it meant for benchmarking or stress testing only? Do you plan to expose it to the users in the Airflow UI?

Re: [DISCUSS] Airflow Scheduling Delay Metric Definition

2022-06-30 Thread Vikram Koka
and how do you intend to use “Scheduling delay”? Is it meant for benchmarking or stress testing only? Do you plan to expose it to the users in the Airflow UI? Thanks Shubham *From

Re: [DISCUSS] Airflow Scheduling Delay Metric Definition

2022-06-08 Thread Ping Zhang
Do you plan to expose it to the users in the Airflow UI? Thanks Shubham *From: *Ping Zhang *Reply-To: *"dev@airflow.apache.org" *Date: *Wednesday, June 8, 2022 at 11:58 AM *To: *"dev@airflow.apache.org"

Re: [DISCUSS] Airflow Scheduling Delay Metric Definition

2022-06-08 Thread Mehta, Shubham
E: [EXTERNAL][DISCUSS] Airflow Scheduling Delay Metric Definition Hi Vikram, Thanks for pointing that out, 'task latency',

Re: [DISCUSS] Airflow Scheduling Delay Metric Definition

2022-06-08 Thread Ping Zhang
Hi Vikram, Thanks for pointing that out, 'task latency', "we define task latency as the time it takes for a task to begin executing once its dependencies have been met." It would be great if you could elaborate more about "begin executing" and how you calculate "its dependencies have been met".

Re: [DISCUSS] Airflow Scheduling Delay Metric Definition

2022-06-08 Thread Vikram Koka
Sorry, I should have asked my question differently. The "task latency" and "task throughput" metrics were meant to focus on the Scheduler performance to identify issues across releases from a software development perspective. However, the "scheduling delay" metric you are proposing seems to be addressing

Re: [DISCUSS] Airflow Scheduling Delay Metric Definition

2022-06-08 Thread Vikram Koka
Ping, I am quite interested in this topic and trying to understand the difference between the "scheduling delay" metric articulated as compared to the "task latency" aka "task lag" metric which we have been using before. As you may recall, we have been using two specific metrics to benchmark Sche

[DISCUSS] Airflow Scheduling Delay Metric Definition

2022-06-08 Thread Ping Zhang
Hi Airflow Community, Airflow is a scheduling platform for data pipelines; however, there is no good metric to measure the scheduling delay in production or in the stress test environment. This makes it hard to catch regressions in the scheduler during the stress test stage. I would like to