Sorry, I should have asked my question differently. The "task latency" and "task throughput" metrics were intended to focus on Scheduler performance, in order to identify issues across releases from a software development perspective.
However, the "scheduling delay" metric you are proposing seems to address a somewhat different need. That's what I am trying to understand.

On Wed, Jun 8, 2022 at 11:24 AM Vikram Koka <vik...@astronomer.io> wrote:

> Ping,
>
> I am quite interested in this topic and am trying to understand the
> difference between the "scheduling delay" metric articulated here as
> compared to the "task latency" aka "task lag" metric which we have been
> using before.
>
> As you may recall, we have been using two specific metrics to benchmark
> Scheduler performance, specifically "task latency" and "task throughput",
> since Airflow 2.0.
> These were described in the 2.0 Scheduler blog post
> <https://www.astronomer.io/blog/airflow-2-scheduler/>.
> Specifically, within that we defined task latency as the time it takes
> for a task to begin executing once its dependencies are all met.
>
> Thanks,
> Vikram
>
>
> On Wed, Jun 8, 2022 at 10:25 AM Ping Zhang <pin...@umich.edu> wrote:
>
>> Hi Airflow Community,
>>
>> Airflow is a scheduling platform for data pipelines; however, there is
>> no good metric to measure scheduling delay in production or in a stress
>> test environment. This makes it hard to catch scheduler regressions
>> during the stress test stage.
>>
>> I would like to propose an Airflow scheduling delay metric definition.
>> Here is the detailed design of the metric and its implementation:
>>
>> https://docs.google.com/document/d/1NhO26kgWkIZJEe50M60yh_jgROaU84dRJ5qGFqbkNbU/edit?usp=sharing
>>
>> Please take a look; any feedback is welcome.
>>
>> Thanks,
>>
>> Ping
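For reference, the two metrics quoted above can be sketched in a few lines. This is only an illustrative sketch using hypothetical timestamps, not Airflow's actual implementation; the function names and inputs are my own, with "task latency" taken as the time from all dependencies being met to the task starting, per the definition in the blog post.

```python
from datetime import datetime, timedelta

def task_latency(deps_met_at: datetime, started_at: datetime) -> timedelta:
    # Task latency (aka task lag): time from all of a task's
    # dependencies being met until the task begins executing.
    return started_at - deps_met_at

def task_throughput(completed_count: int, window: timedelta) -> float:
    # Task throughput: task instances completed per second
    # over a measurement window.
    return completed_count / window.total_seconds()

# Hypothetical example: a task whose dependencies were met at 10:00:00
# and which started at 10:00:12, in a run that completed 3600 tasks
# over one hour.
deps_met = datetime(2022, 6, 8, 10, 0, 0)
started = datetime(2022, 6, 8, 10, 0, 12)
print(task_latency(deps_met, started).total_seconds())  # 12.0
print(task_throughput(3600, timedelta(hours=1)))        # 1.0
```

As I understand it, the open question is how the proposed "scheduling delay" metric relates to (or refines) these two.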