Yep, we can absolutely pull it into the Airflow repo. We've also been building up a test suite that currently runs on CircleCI and uses kind (Kubernetes in Docker) to test several Kubernetes versions under different settings. Right now we're mostly testing the different executors, since they have the biggest impact on what gets deployed, but that can be expanded.
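For reference, each CI job boots a throwaway cluster from a small kind config, roughly like the sketch below (the apiVersion and node image tags vary by kind release, so treat these as placeholders rather than our exact matrix):

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha3
nodes:
  # Pinning the node image selects the Kubernetes version under test;
  # the kind project publishes kindest/node images per Kubernetes release.
  - role: control-plane
    image: kindest/node:v1.16.4
  - role: worker
    image: kindest/node:v1.16.4

Swapping the image tag per job is what lets the same suite cover multiple Kubernetes versions.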
What CRDs are currently being used to run Airflow for the tests?

On Wed, Mar 25, 2020 at 11:06 AM Jarek Potiuk <jarek.pot...@polidea.com> wrote:

> One thing for the donation.
>
> Did you want to have a separate repository, Greg?
>
> I think we should simply create a folder in the Airflow repo and keep it
> there (similarly to how we keep the Dockerfile). I am going to switch our
> Kubernetes tests to the production image (which will make the tests much
> faster) and I am going to test the Dockerfile automatically in CI. For now
> we are using some custom resource definitions to start Airflow on a
> Kubernetes cluster for the tests, but we could switch to using the helm
> chart - this way we can test all three things at once:
> - Kubernetes Executor
> - Dockerfile
> - Helm Chart
> and we could also add more tests - for example, testing different
> deployment options for the helm chart.
>
> Having the Helm chart in the Airflow repo would help with that -
> especially in terms of future changes and testing them automatically.
>
> J.
>
> On Tue, Mar 24, 2020 at 9:09 PM Aizhamal Nurmamat kyzy
> <aizha...@apache.org> wrote:
> >
> > +1 on the donation. Always happy to see more useful stuff for the
> > community :)
> >
> > On Tue, Mar 24, 2020 at 9:20 AM Greg Neiheisel <g...@astronomer.io> wrote:
> >
> > > Yep, the cleanup_pods script is now set up as an optional Kubernetes
> > > CronJob (
> > > https://github.com/astronomer/airflow-chart/blob/master/templates/cleanup/cleanup-cronjob.yaml )
> > > that we run periodically to clean up failed pods, and it could stay
> > > separate.
> > >
> > > The wait_for_migrations script could definitely be pulled into Airflow.
> > > For context, we deploy an initContainer on the scheduler (
> > > https://github.com/astronomer/airflow-chart/blob/master/templates/scheduler/scheduler-deployment.yaml#L77-L84 )
> > > that runs the upgradedb command before booting the scheduler. The new
> > > wait_for_migrations script runs in an initContainer on the webserver and
> > > workers (
> > > https://github.com/astronomer/airflow-chart/blob/master/templates/webserver/webserver-deployment.yaml#L58-L65 )
> > > so that they don't boot ahead of a potentially long-running migration
> > > and attempt to operate on new or missing columns/tables before the
> > > migrations have run. This prevents those pods from entering a CrashLoop.
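> > >
> > > To give a rough idea, the webserver initContainer amounts to something
> > > like the following (a hand-trimmed sketch, not the rendered template -
> > > the image tag, command name, and secret name here are illustrative):
> > >
> > > initContainers:
> > >   - name: wait-for-airflow-migrations
> > >     image: astronomerinc/ap-airflow:1.10.10   # illustrative tag
> > >     # hypothetical entrypoint; the real script ships in the
> > >     # astronomer-airflow-scripts package baked into our images
> > >     command: ["wait_for_migrations"]
> > >     env:
> > >       # give the script the same metadata DB connection as airflow
> > >       - name: AIRFLOW__CORE__SQL_ALCHEMY_CONN
> > >         valueFrom:
> > >           secretKeyRef:
> > >             name: airflow-metadata   # illustrative secret name
> > >             key: connection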
> > >
> > > On Tue, Mar 24, 2020 at 11:48 AM Jarek Potiuk <jarek.pot...@polidea.com>
> > > wrote:
> > >
> > >> > @Tomasz great question. Our images are currently generated from
> > >> > Dockerfiles in this repo https://github.com/astronomer/ap-airflow and
> > >> > get published to DockerHub
> > >> > https://hub.docker.com/repository/docker/astronomerinc/ap-airflow.
> > >> >
> > >> > For the most part those are typical Airflow images. There's an
> > >> > entrypoint script that we include in the image that handles waiting
> > >> > for the database and redis (if used) to come up, which is pretty
> > >> > generic.
> > >>
> > >> I already added waiting for the database (both the metadata and celery
> > >> URLs) in the PR:
> > >> https://github.com/apache/airflow/pull/7832/files#diff-3759f40d4e8ba0c0e82e82b66d376741 .
> > >> It's functionally the same but more generic.
> > >>
> > >> > The only other thing that I think the Helm Chart uses would be the
> > >> > scripts in this repo
> > >> > https://github.com/astronomer/astronomer-airflow-scripts. Our
> > >> > Dockerfiles pull this package in. These scripts are used to coordinate
> > >> > running migrations and cleaning up failed pods.
> > >>
> > >> I see two scripts:
> > >>
> > >> * cleanup_pods -> this is (I believe) not needed inside airflow - it
> > >> could be run as a separate pod/container?
> > >> * waiting for migrations -> I think this is a good candidate for adding
> > >> an *airflow db wait_for_migration* command and making it part of
> > >> airflow itself.
> > >>
> > >> I think we also have to agree on the Airflow versions supported by the
> > >> official helm chart. I'd suggest we support 1.10.10+, incorporate all
> > >> the changes needed in airflow (like the "db wait_for_migration"
> > >> command) into 2.0 and 1.10, and support both the image and the helm
> > >> chart for those versions only. That would help people migrate to the
> > >> latest version.
> > >>
> > >> WDYT?
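> > >>
> > >> For illustration, running the cleanup outside airflow could be as
> > >> simple as a CronJob running kubectl in its own pod - a minimal,
> > >> untested sketch (names and schedule are placeholders, and the service
> > >> account needs RBAC permission to list and delete pods):
> > >>
> > >> apiVersion: batch/v1beta1
> > >> kind: CronJob
> > >> metadata:
> > >>   name: airflow-cleanup-failed-pods
> > >> spec:
> > >>   schedule: "*/15 * * * *"   # every 15 minutes
> > >>   jobTemplate:
> > >>     spec:
> > >>       template:
> > >>         spec:
> > >>           serviceAccountName: airflow-cleanup
> > >>           restartPolicy: OnFailure
> > >>           containers:
> > >>             - name: cleanup
> > >>               image: bitnami/kubectl:1.16   # any image with kubectl
> > >>               # delete failed pods in the release namespace
> > >>               command:
> > >>                 - kubectl
> > >>                 - delete
> > >>                 - pods
> > >>                 - --field-selector=status.phase=Failed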
> > >>
> > >> > On Tue, Mar 24, 2020 at 10:49 AM Daniel Imberman
> > >> > <daniel.imber...@gmail.com> wrote:
> > >> >
> > >> > > @jarek I agree completely. I think that pairing an official helm
> > >> > > chart with the official image would make for a REALLY powerful "up
> > >> > > and running with airflow" story :). Tomek and I have also been
> > >> > > looking into operator-sdk, which can create custom controllers from
> > >> > > helm charts. We might even be able to get a 1-2 punch from the same
> > >> > > code base :).
> > >> > >
> > >> > > @kaxil @jarek @aizhamal @ash if there are no issues, can we please
> > >> > > start the process of donation?
> > >> > >
> > >> > > +1 on my part, of course :)
> > >> > >
> > >> > > Daniel
> > >> > >
> > >> > > On Mar 24, 2020, 7:40 AM -0700, Jarek Potiuk
> > >> > > <jarek.pot...@polidea.com>, wrote:
> > >> > > > +1. And it should be paired with the official image we have work
> > >> > > > in progress on. I looked a lot at the Astronomer image while
> > >> > > > preparing my draft, and we can make any adjustments needed to
> > >> > > > make it work with the helm chart - I am super happy to
> > >> > > > collaborate on that.
> > >> > > >
> > >> > > > PR here: https://github.com/apache/airflow/pull/7832
> > >> > > >
> > >> > > > J.
> > >> > > >
> > >> > > > On Tue, Mar 24, 2020 at 3:15 PM Kaxil Naik <kaxiln...@gmail.com>
> > >> > > > wrote:
> > >> > > > >
> > >> > > > > @Tomasz Urbaszek <tomasz.urbas...@polidea.com>:
> > >> > > > > Helm Chart Link: https://github.com/astronomer/airflow-chart
> > >> > > > >
> > >> > > > > On Tue, Mar 24, 2020 at 2:13 PM Tomasz Urbaszek
> > >> > > > > <turbas...@apache.org> wrote:
> > >> > > > > >
> > >> > > > > > An official helm chart is something our community needs!
> > >> > > > > > Using your chart as the official one makes a lot of sense to
> > >> > > > > > me because, as you mentioned, it's battle-tested.
> > >> > > > > >
> > >> > > > > > One question: what Airflow image do you use? Also, would you
> > >> > > > > > mind sharing a link to the chart?
> > >> > > > > >
> > >> > > > > > Tomek
> > >> > > > > >
> > >> > > > > > On Tue, Mar 24, 2020 at 2:07 PM Greg Neiheisel
> > >> > > > > > <g...@astronomer.io.invalid> wrote:
> > >> > > > > > >
> > >> > > > > > > Hey everyone,
> > >> > > > > > >
> > >> > > > > > > Over the past few years at Astronomer, we've created,
> > >> > > > > > > managed, and hardened a production-ready Helm Chart for
> > >> > > > > > > Airflow ( https://github.com/astronomer/airflow-chart )
> > >> > > > > > > that is being used by both our SaaS and Enterprise
> > >> > > > > > > customers. This chart is battle-tested, running hundreds of
> > >> > > > > > > Airflow deployments of varying sizes and runtime
> > >> > > > > > > environments. It's been built up to address the issues that
> > >> > > > > > > Airflow users run into in the real world.
> > >> > > > > > >
> > >> > > > > > > While this chart was originally developed internally for
> > >> > > > > > > our Astronomer Platform, we've recently decoupled it from
> > >> > > > > > > the rest of our platform to make it usable by the greater
> > >> > > > > > > Airflow community. With these changes in mind, we want to
> > >> > > > > > > start a conversation about donating this chart to the
> > >> > > > > > > Airflow community.
> > >> > > > > > >
> > >> > > > > > > Some of the main features of the chart are:
> > >> > > > > > >
> > >> > > > > > > - It works out of the box. With zero configuration, a user
> > >> > > > > > > will get a Postgres database, a default user, and the
> > >> > > > > > > KubernetesExecutor ready to run DAGs.
> > >> > > > > > > - Support for the Local, Celery (with optional KEDA
> > >> > > > > > > autoscaling), and Kubernetes executors.
> > >> > > > > > > - Support for an optional pgbouncer. We use this to share a
> > >> > > > > > > configurable connection pool per deployment - useful for
> > >> > > > > > > limiting connections to the metadata database.
> > >> > > > > > > - Airflow migration support. A user can push a newer
> > >> > > > > > > version of Airflow into an existing release and migrations
> > >> > > > > > > will automatically run cleanly.
> > >> > > > > > > - Prometheus support. Optionally install and configure a
> > >> > > > > > > statsd-exporter to ingest Airflow metrics and expose them
> > >> > > > > > > to Prometheus automatically.
> > >> > > > > > > - Resource control. Optionally control the ResourceQuotas
> > >> > > > > > > and LimitRanges for each deployment so that no deployment
> > >> > > > > > > can overload a cluster.
> > >> > > > > > > - Simple optional Elasticsearch support.
> > >> > > > > > > - Optional namespace cleanup. Sometimes KubernetesExecutor
> > >> > > > > > > and KubernetesPodOperator pods fail for reasons other than
> > >> > > > > > > the actual task. This feature helps keep things clean in
> > >> > > > > > > Kubernetes.
> > >> > > > > > > - Support for running locally in kind (Kubernetes in
> > >> > > > > > > Docker).
> > >> > > > > > > - Automatically tested across many Kubernetes versions,
> > >> > > > > > > with Helm 2 and 3 support.
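> > >> > > > > > >
> > >> > > > > > > To give a feel for the configuration surface, a user's
> > >> > > > > > > values file looks roughly like this (the key names here are
> > >> > > > > > > illustrative, from memory - read the chart's values.yaml
> > >> > > > > > > rather than copying this sketch):
> > >> > > > > > >
> > >> > > > > > > executor: KubernetesExecutor  # or LocalExecutor / CeleryExecutor
> > >> > > > > > > defaultAirflowRepository: you/your-airflow  # e.g. an image with DAGs baked in
> > >> > > > > > > defaultAirflowTag: 1.10.10
> > >> > > > > > > pgbouncer:
> > >> > > > > > >   enabled: true  # shared, size-limited pool to the metadata db
> > >> > > > > > > statsd:
> > >> > > > > > >   enabled: true  # statsd-exporter metrics for Prometheus
> > >> > > > > > > cleanup:
> > >> > > > > > >   enabled: true  # periodic CronJob deleting failed pods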
> > >> > > > > > >
> > >> > > > > > > We've found that the cleanest and most reliable way to
> > >> > > > > > > deploy DAGs to Kubernetes and manage them at scale is to
> > >> > > > > > > package them into the actual docker image, so we have
> > >> > > > > > > geared this chart towards that method of operation, though
> > >> > > > > > > adding other methods should be straightforward.
> > >> > > > > > >
> > >> > > > > > > We would love thoughts from the community and would love to
> > >> > > > > > > see this chart help others get up and running on
> > >> > > > > > > Kubernetes!
> > >> > > > > > >
> > >> > > > > > > --
> > >> > > > > > > *Greg Neiheisel* / Chief Architect Astronomer.io
>
> --
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
>
> M: +48 660 796 129

--
*Greg Neiheisel* / Chief Architect Astronomer.io