Re: [I] Scheduler is spending most of its time in clear_not_launched_queued_tasks function [airflow]

via GitHub Thu, 12 Oct 2023 09:05:14 -0700


ROVAN1220 commented on issue #34877:
URL: https://github.com/apache/airflow/issues/34877#issuecomment-1759924519


   It seems like you've identified a performance bottleneck in your Airflow 
setup when running on a large Kubernetes cluster with a high number of queued 
tasks. Your proposed solution of making batch calls to get all the Airflow 
worker pods instead of making individual calls for each task is a reasonable 
approach to address the issue. Here are some steps you can take to optimize the 
situation:
   
   Batch API Calls: Modify the clear_not_launched_queued_tasks function to make 
batch calls to the Kubernetes API to fetch information about all the Airflow 
worker pods. This replaces one API request per queued task with a single list 
request per run of the function, so the cost no longer grows with queue length.
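   
   A minimal sketch of the batching idea, not the actual Airflow code. The Pod 
tuple, the list_pods callable and the dag_id/task_id/run_id label names are 
assumptions for illustration; with the official Kubernetes Python client, 
list_pods would wrap something like CoreV1Api.list_namespaced_pod.

```python
from collections import namedtuple

# Hypothetical stand-in for a pod object; only the name and labels matter here.
Pod = namedtuple("Pod", ["name", "labels"])


def index_pods_by_task(pods):
    """Group pod names by their (dag_id, task_id, run_id) labels in one pass."""
    index = {}
    for pod in pods:
        key = (
            pod.labels.get("dag_id"),
            pod.labels.get("task_id"),
            pod.labels.get("run_id"),
        )
        index.setdefault(key, []).append(pod.name)
    return index


def find_not_launched(queued_tasks, list_pods):
    """Return queued task keys that have no pod, using a single list call."""
    index = index_pods_by_task(list_pods())  # one API round trip for all tasks
    return [task for task in queued_tasks if task not in index]


# Demo with a fake lister that counts how often it is called.
calls = []


def fake_list_pods():
    calls.append(1)
    return [Pod("pod-a", {"dag_id": "d1", "task_id": "t1", "run_id": "r1"})]


queued = [("d1", "t1", "r1"), ("d1", "t2", "r1")]
missing = find_not_launched(queued, fake_list_pods)
```

   Each queued task is then checked with a dictionary lookup instead of its own 
API request.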
   
   Optimize Query Filters: When querying for pods, ensure you use efficient 
filters to fetch only the necessary information. For example, you may want to 
filter by labels or other criteria to narrow down the list of relevant pods.
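   
   As a rough illustration, a label selector can be assembled as a string and 
passed to the list call, so that the API server does the filtering. The helper 
name and the airflow-worker and dag_id labels below are assumptions; the 
selector syntax itself (key=value, key in (a,b)) is standard Kubernetes.

```python
def build_label_selector(equals, in_sets=None):
    """Build a Kubernetes label selector string from equality and set rules."""
    parts = [f"{key}={value}" for key, value in sorted(equals.items())]
    for key, values in sorted((in_sets or {}).items()):
        parts.append(f"{key} in ({','.join(sorted(values))})")
    return ",".join(parts)


# Hypothetical labels: restrict to one scheduler's workers and two DAGs.
selector = build_label_selector(
    {"airflow-worker": "42"},
    {"dag_id": {"d1", "d2"}},
)
```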
   
   Caching: Consider implementing a caching mechanism to store information 
about worker pods, so you don't need to query the Kubernetes API every time the 
function runs. You can set up a cache expiration strategy to periodically 
refresh the pod information.
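   
   One possible shape for such a cache is a small time-to-live wrapper around 
whatever function lists the pods. This is a sketch under assumptions: TTLCache 
and load_pods are hypothetical names, and the 30-second expiry is arbitrary.

```python
import time


class TTLCache:
    """Cache the result of a loader function for a fixed number of seconds."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = None

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # First use or entry expired: reload and reset the expiry time.
            self._value = self._loader()
            self._expires_at = now + self._ttl
        return self._value


# Demo with a fake clock so expiry can be shown without sleeping.
loads = []


def load_pods():
    loads.append(1)
    return ["pod-a", "pod-b"]


now = [0.0]
cache = TTLCache(load_pods, ttl_seconds=30, clock=lambda: now[0])
cache.get()
cache.get()      # served from the cache, no reload
now[0] = 31.0
cache.get()      # entry expired, loader runs again
```

   The trade-off is staleness: a pod created after the last refresh is not 
visible until the entry expires, so the expiry should stay well below any 
timeout that acts on the cached data.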
   
   Throttling: If the Kubernetes API calls are still causing performance 
issues, you can implement a throttling mechanism to limit the frequency and 
number of API calls. This can help balance the load on the API server.
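   
   A token bucket is one common way to do this; the sketch below is generic and 
not taken from Airflow. The class name, rate and capacity are illustrative 
assumptions.

```python
import time


class TokenBucket:
    """Allow short bursts up to `capacity`, refilled at `rate_per_sec`."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.clock = clock
        self.last = clock()

    def try_acquire(self):
        """Return True if a call may proceed now, False if it should wait."""
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Demo with a fake clock: 2 calls per second, bursts of at most 2.
now = [0.0]
bucket = TokenBucket(rate_per_sec=2, capacity=2, clock=lambda: now[0])
burst = [bucket.try_acquire() for _ in range(3)]
now[0] = 0.5     # half a second later one token has been refilled
after_wait = bucket.try_acquire()
```

   A caller that gets False can skip the API call for this loop iteration and 
retry on the next one.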
   
   Scaling Resources: In case the Kubernetes cluster is continually growing or 
experiencing resource constraints, consider scaling your cluster to ensure that 
it can handle the increased workload efficiently.
   
   Tune Airflow Scheduler Settings: Review the Airflow scheduler settings and 
parameters to optimize its performance. For example, you can adjust 
parsing_processes (called max_threads before Airflow 2.0), 
scheduler_heartbeat_sec, or other configuration options to better align with 
your cluster's capacity.
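   
   Airflow options can also be set through environment variables named 
AIRFLOW__{SECTION}__{KEY}. The sketch below only builds those names. The two 
options shown and their values are assumptions for illustration: check that 
they exist under these sections in your Airflow version (the 
kubernetes_executor section was named kubernetes in older releases).

```python
def airflow_env_var(section, key):
    """Return the environment variable name Airflow reads for an option."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


# Illustrative overrides only; the values are not recommendations.
overrides = {
    ("scheduler", "parsing_processes"): "4",
    ("kubernetes_executor", "worker_pods_queued_check_interval"): "120",
}
env = {airflow_env_var(s, k): v for (s, k), v in overrides.items()}
```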
   
   Asynchronous Processing: If possible, you may explore asynchronous 
processing for certain tasks that don't need immediate scheduling, which can 
help reduce the load on the scheduler.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
