<3

On Thu, Mar 26, 2020 at 3:59 PM Daniel Imberman <daniel.imber...@gmail.com>
wrote:

> @Jarek I think with the helm chart + prod image we can go even further
> than that :). We can test CeleryExecutor, with KEDA autoscaling, and a
> bunch of other configurations.
> On Mar 26, 2020, 7:45 AM -0700, Jarek Potiuk <jarek.pot...@polidea.com>,
> wrote:
>
> Yeah. I meant Custom Resources, not CRDs, in my original email :)
>
> On Thu, Mar 26, 2020 at 3:38 PM Daniel Imberman <daniel.imber...@gmail.com>
> wrote:
>
>> We’re not using CRDs for the tests at the moment. We just have deployment
>> files. If anything, having the helm chart as a part of the airflow repo
>> could mean that the helm chart becomes the de facto system for testing
>> airflow on kubernetes (we can get rid of all the YAML files and run
>> multiple k8s tests with different settings).
>> On Mar 26, 2020, 7:20 AM -0700, Greg Neiheisel <g...@astronomer.io.invalid>,
>> wrote:
>>
>> Yep, we can absolutely pull it into the airflow repo. We've also been
>> building up a test suite that currently runs on CircleCI and uses kind
>> (Kubernetes in Docker) to test several kubernetes versions with some
>> different settings. Right now we're mostly testing the different executors
>> since that has the biggest impact on what gets deployed, but that can be
>> expanded.
>>
>> What CRDs are currently being used to run Airflow for the tests?
>>
>> On Wed, Mar 25, 2020 at 11:06 AM Jarek Potiuk <jarek.pot...@polidea.com>
>> wrote:
>>
>> One thing for the donation.
>>
>> Did you want to have a separate repository, Greg?
>>
>> I think we should simply create a folder in the Airflow repo and keep it
>> there (similar to how we keep the Dockerfile). I am going to switch our
>> Kubernetes tests to the production image (which will make the tests much
>> faster) and I am going to test the Dockerfile automatically in CI. For
>> now we are using some custom Resource definitions to start Airflow
>> on a Kubernetes cluster for the tests, but we could switch to using the
>> helm chart. This way we can test all three things at once:
>> - Kubernetes Executor
>> - Dockerfile
>> - Helm Chart
>> and we could also add more tests - for example testing different
>> deployment options for the helm chart.
>>
>> Having the Helm chart in Airflow repo would help with that -
>> especially in terms of future changes and testing them automatically.
>>
>> J.
>>
>> On Tue, Mar 24, 2020 at 9:09 PM Aizhamal Nurmamat kyzy
>> <aizha...@apache.org> wrote:
>>
>>
>> +1 on the donation. Always happy to see more useful stuff for the
>> community :)
>>
>> On Tue, Mar 24, 2020 at 9:20 AM Greg Neiheisel <g...@astronomer.io>
>> wrote:
>>
>>
>> Yep, the cleanup_pods script is set up now as an optional Kubernetes
>> CronJob
>> (https://github.com/astronomer/airflow-chart/blob/master/templates/cleanup/cleanup-cronjob.yaml)
>> that we have run periodically to clean failed pods up and could stay
>> separate.
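
To make the cleanup idea concrete, here is a minimal sketch of the selection step such a periodic job might perform. It is not the actual cleanup_pods script: the function name is hypothetical, and it assumes the pods arrive as dictionaries shaped like the items returned by `kubectl get pods -o json` (with `metadata.name` and `status.phase`). Deleting the selected pods is left out on purpose.

```python
from typing import Iterable, List, Mapping


def failed_pod_names(pods: Iterable[Mapping]) -> List[str]:
    """Return the names of pods whose status.phase is "Failed".

    Hypothetical helper for illustration only. Each pod is assumed to be a
    dict shaped like an item from `kubectl get pods -o json`.
    """
    names = []
    for pod in pods:
        # A pod without a reported phase is skipped rather than treated as failed.
        phase = pod.get("status", {}).get("phase")
        if phase == "Failed":
            names.append(pod["metadata"]["name"])
    return names
```

A CronJob could run something like this on a schedule and then delete the returned pods, which keeps the cleanup separate from Airflow itself.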
>>
>> The wait_for_migrations script could definitely be pulled into Airflow.
>> For context, we deploy an initContainer on the scheduler
>> (https://github.com/astronomer/airflow-chart/blob/master/templates/scheduler/scheduler-deployment.yaml#L77-L84)
>> that runs the upgradedb command before booting the scheduler. This new
>> wait_for_migration script runs in an initContainer on the webserver and
>> workers
>> (https://github.com/astronomer/airflow-chart/blob/master/templates/webserver/webserver-deployment.yaml#L58-L65)
>> so that they don't boot up ahead of a potentially long-running migration
>> and attempt to operate on new or missing columns/tables before the
>> migrations run. This prevents these pods from entering a CrashLoop.
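
The waiting behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Astronomer script or an existing Airflow command: the function name, its parameters, and the idea of passing in a callable that reports the current schema revision are all hypothetical. In a real deployment that callable would query the metadata database.

```python
import time
from typing import Callable, Optional


def wait_for_migrations(
    get_current_revision: Callable[[], Optional[str]],
    target_revision: str,
    timeout: float = 60.0,
    poll_interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Block until the database reports the target schema revision.

    Hypothetical sketch: get_current_revision stands in for a query against
    the metadata database. Raises TimeoutError if the target revision is not
    reached before the timeout expires.
    """
    deadline = clock() + timeout
    while True:
        current = get_current_revision()
        if current == target_revision:
            return  # Migrations are done; the main container may start.
        if clock() >= deadline:
            raise TimeoutError(
                f"database at revision {current!r}, expected {target_revision!r}"
            )
        sleep(poll_interval)
```

Run in an initContainer, a non-zero exit on timeout makes the init step fail and be retried, so the webserver and workers never start against a half-migrated schema.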
>>
>> On Tue, Mar 24, 2020 at 11:48 AM Jarek Potiuk <jarek.pot...@polidea.com>
>> wrote:
>>
>>
>> @Tomasz great question. Our images are currently generated from Dockerfiles
>> in this repo https://github.com/astronomer/ap-airflow and get published to
>> DockerHub
>> https://hub.docker.com/repository/docker/astronomerinc/ap-airflow.
>>
>> For the most part those are typical Airflow images. There's an entrypoint
>> script that we include in the image that handles waiting for the database
>> and redis (if used) to come up, which is pretty generic.
>>
>>
>>
>> I already added waiting for the database (both metadata and celery URL) in
>> the PR:
>> https://github.com/apache/airflow/pull/7832/files#diff-3759f40d4e8ba0c0e82e82b66d376741
>> It's functionally the same but more generic.
>>
>> The only other thing that I think the Helm Chart uses would be the scripts
>> in this repo https://github.com/astronomer/astronomer-airflow-scripts. Our
>> Dockerfiles pull this package in. These scripts are used to coordinate
>> running migrations and cleaning up failed pods.
>>
>>
>> I see two scripts:
>>
>> * cleanup_pods -> this is (I believe) not needed to run in airflow - this
>> could be run as a separate pod/container?
>> * waiting for migrations -> I think this is a good candidate to add
>> *airflow db wait_for_migration* command and make it part of airflow itself.
>>
>> I think we also have to agree on the Airflow version supported by the
>> official helm chart. I'd suggest we support 1.10.10+ and we incorporate all
>> the changes needed to airflow (like the "db wait_for_migration") into 2.0
>> and 1.10 and we support both - image and helm chart for those versions
>> only. That would help with people migrating to the latest version.
>>
>> WDYT?
>>
>>
>> On Tue, Mar 24, 2020 at 10:49 AM Daniel Imberman <
>> daniel.imber...@gmail.com>
>> wrote:
>>
>> @jarek I agree completely. I think that pairing an official helm chart
>> with the official image would make for a REALLY powerful “up and running
>> with airflow” story :). Tomek and I have also been looking into
>> operator-sdk which has the ability to create custom controllers from helm
>> charts. We might even be able to get a 1-2 punch from the same code base
>> :).
>>
>>
>> @kaxil @jarek @aizhamal @ash if there’s no issues, can we please start the
>> process of donation?
>>
>> +1 on my part, of course :)
>>
>>
>>
>> Daniel
>> On Mar 24, 2020, 7:40 AM -0700, Jarek Potiuk <jarek.pot...@polidea.com>,
>> wrote:
>>
>> +1. And it should be paired with the official image we have work in
>> progress on. I looked a lot at Astronomer's image while preparing my
>> draft and we can make any adjustments needed to make it work with the helm
>> chart - and I am super happy to collaborate on that.
>>
>> PR here: https://github.com/apache/airflow/pull/7832
>>
>> J.
>>
>>
>> On Tue, Mar 24, 2020 at 3:15 PM Kaxil Naik <kaxiln...@gmail.com>
>> wrote:
>>
>>
>> @Tomasz Urbaszek <tomasz.urbas...@polidea.com> :
>> Helm Chart Link: https://github.com/astronomer/airflow-chart
>>
>> On Tue, Mar 24, 2020 at 2:13 PM Tomasz Urbaszek <turbas...@apache.org>
>> wrote:
>>
>> An official helm chart is something our community needs! Using your
>> chart as the official one makes a lot of sense to me because, as you
>> mentioned, it's battle tested.
>>
>> One question: what Airflow image do you use? Also, would you mind
>> sharing a link to the chart?
>>
>> Tomek
>>
>>
>> On Tue, Mar 24, 2020 at 2:07 PM Greg Neiheisel
>> <g...@astronomer.io.invalid> wrote:
>>
>>
>> Hey everyone,
>>
>> Over the past few years at Astronomer, we’ve created, managed, and hardened
>> a production-ready Helm Chart for Airflow
>> (https://github.com/astronomer/airflow-chart) that is being used by both our
>> SaaS and Enterprise customers. This chart is battle-tested and running
>> hundreds of Airflow deployments of varying sizes and runtime environments.
>> It’s been built up to encapsulate the issues that Airflow users run into in
>> the real world.
>>
>> While this chart was originally developed internally for our Astronomer
>> Platform, we’ve recently decoupled the chart from the rest of our platform
>> to make it usable by the greater Airflow community. With these changes in
>> mind, we want to start a conversation about donating this chart to the
>> Airflow community.
>>
>> Some of the main features of the chart are:
>>
>> - It works out of the box. With zero configuration, a user will get a
>> postgres database, a default user and the KubernetesExecutor ready to run
>> DAGs.
>> - Support for Local, Celery (w/ optional KEDA autoscaling) and
>> Kubernetes executors.
>> - Support for optional pgbouncer. We use this to share a configurable
>> connection pool size per deployment. Useful for limiting connections to the
>> metadata database.
>> - Airflow migration support. A user can push a newer version of Airflow
>> into an existing release and migrations will automatically run cleanly.
>> - Prometheus support. Optionally install and configure a statsd-exporter
>> to ingest Airflow metrics and expose them to Prometheus automatically.
>> - Resource control. Optionally control the ResourceQuotas and
>> LimitRanges for each deployment so that no deployment can overload a
>> cluster.
>> - Simple optional Elasticsearch support.
>> - Optional namespace cleanup. Sometimes KubernetesExecutor and
>> KubernetesPodOperator pods fail for reasons other than the actual task.
>> This feature helps keep things clean in Kubernetes.
>> - Support for running locally in KIND (Kubernetes in Docker).
>> - Automatically tested across many Kubernetes versions with Helm 2
>> and 3 support.
>>
>> We’ve found that the cleanest and most reliable way to deploy DAGs to
>> Kubernetes and manage them at scale is to package them into the actual
>> docker image, so we have geared this chart towards that method of
>> operation, though adding other methods should be straightforward.
>>
>> We would love thoughts from the community and would love to see this chart
>> help others to get up and running on Kubernetes!
>>
>> --
>> *Greg Neiheisel* / Chief Architect Astronomer.io
>>
>>
>>
>>
>>
>> --
>>
>> Jarek Potiuk
>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>
>> M: +48 660 796 129 <+48660796129>
>> [image: Polidea] <https://www.polidea.com/>

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>
