Hey Sergio, here's an example of how we use the built-in health check for
the webserver -
https://github.com/astronomer/airflow-chart/blob/master/templates/webserver/webserver-deployment.yaml#L94-L117.
The probe just checks that the webserver returns a 200 on that request,
rather than examining the body of the response.
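
For reference, a stripped-down version of that probe looks something like
this (the port and timings here are illustrative, not the chart's exact
values):

  # Restart the container if GET /health stops returning 200.
  livenessProbe:
    httpGet:
      path: /health   # Airflow's built-in health endpoint
      port: 8080      # assumes the default webserver port
    initialDelaySeconds: 15
    periodSeconds: 10
    timeoutSeconds: 30
    failureThreshold: 5
  # Only route traffic to the pod once /health starts responding.
  readinessProbe:
    httpGet:
      path: /health
      port: 8080
    initialDelaySeconds: 15
    periodSeconds: 10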

We do something a little different on the scheduler -
https://github.com/astronomer/airflow-chart/blob/master/templates/scheduler/scheduler-deployment.yaml#L94-L112.
This execs into the scheduler container and checks the last heartbeat
recorded in the database. If it's been too long since the scheduler
heartbeated, the probe fails and Kubernetes restarts the container.
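
The exact command is templated in the chart, but a simplified sketch of the
idea (assuming Airflow 1.10 import paths and a 60-second staleness
threshold, both of which you'd want to adjust) would be:

  livenessProbe:
    initialDelaySeconds: 300
    periodSeconds: 60
    exec:
      command:
        - python
        - -c
        - |
          import sys
          from airflow.jobs import SchedulerJob
          from airflow.utils import timezone
          from airflow.utils.db import create_session

          with create_session() as session:
              # Grab the scheduler job row with the newest heartbeat.
              job = (session.query(SchedulerJob)
                     .order_by(SchedulerJob.latest_heartbeat.desc())
                     .first())
              # Fail the probe (non-zero exit) if no scheduler job exists
              # or the last heartbeat is older than the threshold.
              if job is None:
                  sys.exit(1)
              age = (timezone.utcnow() - job.latest_heartbeat).total_seconds()
              sys.exit(0 if age < 60 else 1)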

Hope that helps.

On Tue, Apr 21, 2020 at 3:37 AM Sergio Kef <[email protected]> wrote:

> Hi folks,
>
> We currently deploy Airflow on Kubernetes (using a custom image; migrating
> to the official image is planned) and we use the Local executor (a change
> to the Kubernetes executor is also planned).
> We are running into the following problem: for cost efficiency, our testing
> cluster is scaled down every night. Then every morning the pod running
> Airflow is up, but not healthy. The issue comes from the way we start the
> scheduler and the webserver. Since they are two processes, we would need
> something like supervisord to manage them.
>
> Now my question is: given that Airflow has a health check
> <https://airflow.apache.org/docs/stable/howto/check-health.html>, how could
> it be used in a liveness/readiness probe from Kubernetes so that it
> understands the pod is no longer healthy and redeploys it?
>
> Have others run into similar issues? If so, how did you approach them?
>
> Sergio.
>


-- 
*Greg Neiheisel* / Chief Architect Astronomer.io