GitHub user brokenjacobs added a comment to the discussion: Best practice for 
webserver liveness probe check

This is a Kubernetes deployment. The liveness probe comes from the kubelet. The 
Kubernetes events (not logs) indicate a timeout checking for liveness, and the 
pod eventually receives a SIGTERM from the kubelet (visible in the API server 
log) because liveness fails. There is no OOM kill; the OOM killer sends SIGKILL, 
not SIGTERM.
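
To illustrate the SIGTERM vs. SIGKILL distinction, here is a minimal Python sketch. It assumes the container's exit code is available (for example from the pod's last terminated state) and that the common 128 + signal-number convention applies. The function name describe_exit_code is hypothetical and not part of Airflow or Kubernetes. Note that a process which handles SIGTERM and shuts down gracefully may still exit with 0, so this is a rough diagnostic aid only.

```python
import signal


def describe_exit_code(exit_code: int) -> str:
    """Roughly classify a container exit code (hypothetical helper).

    Assumes the 128 + signal-number convention for signal deaths on a
    POSIX system: 137 = 128 + 9 (SIGKILL), 143 = 128 + 15 (SIGTERM).
    """
    if exit_code > 128:
        signum = exit_code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    if exit_code == 0:
        return "exited cleanly"
    return f"exited with error code {exit_code}"


if __name__ == "__main__":
    # 137 is what an OOM kill typically looks like; 143 is a plain SIGTERM.
    for code in (137, 143, 0, 1):
        print(code, "->", describe_exit_code(code))
```

An exit code of 143 rather than 137 is consistent with a kubelet-initiated restart and not an OOM kill.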

Airflow 2 did not have liveness issues when the API was lightly used. This is 
new with Airflow 3. I’ve extended the liveness timeouts and retries and still 
have the same issues. I’ve also increased pod resources (more CPU/RAM) and even 
switched to a faster instance type for the pod, with no improvement.
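
For readers less familiar with how the timeout and retry settings interact, the following Python sketch models the restart decision. It assumes the standard probe semantics: a probe counts as failed when the response takes longer than timeoutSeconds, and the container is restarted after failureThreshold consecutive failures. The function name and the latency values are hypothetical and are not measurements from this deployment.

```python
from typing import Optional, Sequence


def first_restart_index(
    latencies_s: Sequence[float],
    timeout_s: float,
    failure_threshold: int,
) -> Optional[int]:
    """Return the index of the probe that would trigger a restart, or None.

    latencies_s: response time of each successive probe, in seconds
                 (illustrative values, not real data).
    timeout_s: stands in for the probe's timeoutSeconds.
    failure_threshold: stands in for the probe's failureThreshold.
    """
    consecutive_failures = 0
    for index, latency in enumerate(latencies_s):
        if latency > timeout_s:
            consecutive_failures += 1
            if consecutive_failures >= failure_threshold:
                return index
        else:
            # Any successful probe resets the failure counter.
            consecutive_failures = 0
    return None


if __name__ == "__main__":
    sample = [0.2, 6.0, 7.0, 8.0]  # hypothetical probe latencies
    print(first_restart_index(sample, timeout_s=5.0, failure_threshold=3))
    print(first_restart_index(sample, timeout_s=10.0, failure_threshold=3))
```

Under this model, raising the timeout or threshold only helps if the endpoint eventually responds within the new limit. If restarts continue after both are raised, that suggests the server stops responding for longer stretches, not that it is merely slow.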

GitHub link: 
https://github.com/apache/airflow/discussions/54853#discussioncomment-14427696
