Adding to Greg's point: if you're using the k8s executor and the k8s
executor worker pod fails to launch within 120 seconds for some reason
(e.g. it is stuck pending while a new node scales up), this counts as a
task failure. Also, if the k8s executor pod has already launched a pod
operator and is then killed (e.g. manually or due to a node upgrade), the
pod operator it launched is not killed and runs to completion, so if you
use retries you need to ensure idempotency. Per my understanding the worker
pods update the db themselves, and each one requires a separate connection
to the db, which can tax your connection budget (100-300 for small postgres
instances on gcp or aws).
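
To make the idempotency point a bit more concrete, here is a minimal sketch of
one way to write a task body so that a retry is harmless. It is plain Python,
not Airflow API: the function name write_partition and its parameters
(output_dir, run_date, rows) are hypothetical and only for illustration. The
idea is that the output location depends only on the run's date (for example
the execution date), and the file is replaced as a whole rather than appended
to.

```python
import os
import tempfile


def write_partition(output_dir, run_date, rows):
    """Write rows for one run to a path derived only from run_date.

    Hypothetical helper: a retry for the same run_date replaces the file
    instead of appending, so running twice leaves the same result as once.
    """
    os.makedirs(output_dir, exist_ok=True)
    final_path = os.path.join(output_dir, "data_{}.csv".format(run_date))

    # Write to a temp file in the same directory, then rename it over the
    # target, so a task killed halfway never leaves a partial final file.
    fd, tmp_path = tempfile.mkstemp(dir=output_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            for row in rows:
                handle.write(",".join(str(value) for value in row) + "\n")
        os.replace(tmp_path, final_path)
    except BaseException:
        # Clean up the temp file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path
```

If the first attempt's pod is still running when the retry starts, both
attempts write the same content to the same path, so whichever finishes last
leaves the same file behind. This only holds if the rows themselves are
deterministic for a given run_date.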

On Thu, Aug 30, 2018 at 6:04 PM Greg Neiheisel <g...@astronomer.io> wrote:

> Hey Kyle, the task pods will continue to run even if you reboot the
> scheduler and webserver and the status does get updated in the airflow db,
> which is great.
>
> I know the scheduler subscribes to the Kubernetes watch API to get an event
> stream of pods completing and it keeps a checkpoint so it can resubscribe
> when it comes back up.
>
> I forget if the worker pods update the db or if the scheduler is doing
> that, but it should work out.
>
> On Thu, Aug 30, 2018, 9:54 AM Kyle Hamlin <hamlin...@gmail.com> wrote:
>
> > gentle bump
> >
> > On Wed, Aug 22, 2018 at 5:12 PM Kyle Hamlin <hamlin...@gmail.com> wrote:
> >
> > > I'm about to make the switch to Kubernetes with Airflow, but am wondering
> > > what happens when my CI/CD pipeline redeploys the webserver and scheduler
> > > and there are still long-running tasks (pods). My intuition is that, since
> > > the database holds all state, the tasks are in charge of updating their
> > > own state, and the UI only renders what it sees in the database, this
> > > is not so much of a problem. To be sure, however, here are my questions:
> > >
> > > Will task pods continue to run?
> > > Can task pods continue to poll the external system they are running tasks
> > > on while being "headless"?
> > > Can the task pods change/update state in the database while being
> > > "headless"?
> > > Will the UI/Scheduler still be aware of the tasks (pods) once they are
> > > live again?
> > >
> > > Is there anything else that might cause issues when deploying while tasks
> > > (pods) are running that I'm not thinking of here?
> > >
> > > Kyle Hamlin
> > >
> >
> >
> > --
> > Kyle Hamlin
> >
>
