GitHub user brokenjacobs added a comment to the discussion: Best practice for webserver liveness probe check
This is a Kubernetes deployment, and the liveness probe comes from the kubelet. The Kubernetes events (not the logs) show a timeout on the liveness check, and the pod eventually receives a SIGTERM from the kubelet (which shows up in the api server log) because liveness fails. This is not an OOM kill: the OOM killer sends SIGKILL, not SIGTERM. Airflow 2 had no liveness issues when the API was lightly used; this is new with Airflow 3. I've extended the liveness timeouts and retries, increased the pod's resources (more CPU and RAM), and even switched to a faster instance type for the pod. The problem persists.

GitHub link: https://github.com/apache/airflow/discussions/54853#discussioncomment-14427696
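
For anyone trying to reproduce this outside the kubelet, here is a minimal Python sketch that times a health endpoint the way an httpGet probe would and checks whether a run of results would trigger a restart. The function names `probe_once` and `would_restart` are hypothetical helpers, not part of Airflow or Kubernetes, and the URL, timeout, and threshold are whatever your own probe is configured with. It only approximates kubelet behaviour (a status from 200 to 399 counts as success, and the container is restarted after `failureThreshold` consecutive failures).

```python
import http.client
import time
import urllib.request


def probe_once(url, timeout_seconds):
    """Issue one HTTP GET and return (ok, elapsed_seconds).

    Roughly mimics an httpGet probe: a status from 200 to 399 is a
    success; a timeout, connection error, or other status is a failure.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_seconds) as resp:
            ok = 200 <= resp.status < 400
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError (4xx/5xx) and socket timeouts are all OSError.
        ok = False
    return ok, time.monotonic() - start


def would_restart(results, failure_threshold):
    """Return True if results contains failure_threshold consecutive failures."""
    consecutive = 0
    for ok in results:
        consecutive = 0 if ok else consecutive + 1
        if consecutive >= failure_threshold:
            return True
    return False
```

Calling `probe_once` every `periodSeconds` from inside the pod (or from a sidecar) and logging the elapsed times should show whether the endpoint is genuinely slower than `timeoutSeconds` or whether the timeouts only appear when the kubelet is the caller.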
