Ah, I guess I am a little bit late now.

In terms of

> Isolating code execution and parsing of DAG files.

In Airbnb, we use `Docker runtime isolation for airflow tasks` plus `Parsing Service` to fully isolate DAG parsing and task execution from the Airflow infra runtime.

`Docker runtime isolation for airflow tasks` (see email thread: [DISCUSS] Docker runtime isolation for airflow tasks) introduces a Docker layer that wraps DAG parsing and task execution, so each DAG file can have its own Docker runtime.

`Parsing Service` removes DAG file parsing from the scheduler entirely.

These two features have been running in Airbnb's production for close to 1 year. I am working on open-sourcing them.

Ping

Best wishes

Ping Zhang

On Wed, Dec 1, 2021 at 11:43 AM Jarek Potiuk <[email protected]> wrote:

> Very good/important thoughts.
>
> From the discussions and looking at the (upcoming) proposals from
> Mateusz we are going to have this all optional:
>
> We plan to have two config options:
>
> * DB Isolation mode for separating out DB access
> * Standalone DAG processor
>
> I totally agree that standalone/quick/dirty access mode for Airflow
> should be the default (so business as usual). Moreover - that will
> allow the introduction of the multi-tenant mode as "optional" in
> otherwise backwards-compatible Airflow - i.e. it could start to be
> available in the 2.x line.
>
> Actually (and this is something up for discussion in the AIP) we could
> introduce a "soft" multi-tenancy mode, where DB access will still be
> possible but flagged as a warning.
> This could give users an option to gradually switch their DAGs to
> the multi-tenancy mode, if they are already using some direct DB
> access (for example in their callbacks or custom operators).
>
> Also I think part of the AIP and proof of concept while discussing it
> should be initially rough, and later more comprehensive, performance
> testing of some "real-life" scenarios.
>
> J.
>
> On Wed, Dec 1, 2021 at 6:12 PM Ash Berlin-Taylor <[email protected]> wrote:
> >
> > I look forward to seeing these proposals etc.
> >
> > One thought I've just had is that we should be careful about two things
> > when taking on this work:
> >
> > 1. That performance is not impacted (specifically of the scheduler
> > "throughput") -- at least when only a single "tenant" is in use if not for
> > all.
> > 2. That we don't make the deployment story more complex for the small
> > deployments, nor for the "getting started on a laptop" initial user
> > experience.
> >
> > -ash
> >
> > On Fri, Nov 26 2021 at 18:23:32 +0100, Jarek Potiuk <[email protected]>
> > wrote:
> >
> > Recording available here:
> > https://drive.google.com/file/d/1Irw7qxxeTOHZTfdvT5lAbGowIfm9DHzi/view
> >
> > On Fri, Nov 26, 2021 at 6:17 PM Jarek Potiuk <[email protected]> wrote:
> >
> > Thanks for the meeting this morning/afternoon :) ! It was very
> > productive, I believe.
> >
> > The notes are available here:
> > https://docs.google.com/document/d/19d0jQeARnWm8VTVc_qSIEDldQ81acRk0KuhVwAd96wo/edit
> >
> > The most important takeaway is that, even if the use cases are slightly
> > different, it looks like we are all aligned on what needs to be done and how.
> >
> > Action points:
> >
> > * Composer team (Mateusz) will soon submit AIPs (they are close to
> >   being ready for proposing) for
> >   * DB access isolation
> >   * Separating out DAG processor
> > * Cloudera team (Ian) will work on a follow-up Fine-grained
> >   resource access AIP - it can be implemented as next steps.
> >
> > The two AIPs above will implement a "coarse" access level, but in a way
> > that allows "fine-grained" access to be plugged in.
> >
> > I recorded the meeting and I am waiting for the video to be processed -
> > I will send/add it to the notes when I get it.
> >
> > J.
> >
> > On Fri, Nov 26, 2021 at 2:29 PM Jarek Potiuk <[email protected]> wrote:
> > >
> > > Reminder: the SIG meeting is today in ~2.5 hrs.
> > >
> > > Calendar link here:
> > > https://calendar.google.com/event?action=TEMPLATE&tmeid=N3ZmbGFxNGF1OXBtajc2ODU3bWduMWVvc2YgcG90aXVrLmFwYWNoZS5vcmdAbQ&tmsrc=potiuk.apache.org%40gmail.com
> > >
> > > Notes/material links will be added here:
> > > https://docs.google.com/document/d/19d0jQeARnWm8VTVc_qSIEDldQ81acRk0KuhVwAd96wo/edit?usp=sharing
> > >
> > > I will record the meeting and post the link together with the notes.
> > >
> > > On Thu, Nov 25, 2021 at 3:31 PM Jarek Potiuk <[email protected]> wrote:
> > > >
> > > > Just a reminder -> multi-tenancy meeting tomorrow. A few people worked
> > > > on what will be presented tomorrow, and I am super excited we will be
> > > > able to kick that one off - it has been a long time on my waiting list
> > > > :)
> > > >
> > > > J.
> > > >
> > > > On Sat, Nov 20, 2021 at 10:14 AM Jarek Potiuk <[email protected]> wrote:
> > > > >
> > > > > The meeting is set for Friday 26th Nov 5 PM CET (4 PM UTC)
> > > > >
> > > > > This is the calendar link (google meet link there):
> > > > > https://calendar.google.com/event?action=TEMPLATE&tmeid=N3ZmbGFxNGF1OXBtajc2ODU3bWduMWVvc2YgcG90aXVrLmFwYWNoZS5vcmdAbQ&tmsrc=potiuk.apache.org%40gmail.com
> > > > >
> > > > > The initial agenda:
> > > > >
> > > > > 1) The goal of the group, intro about the "isolation" and various "scopes"
> > > > > of the multi-tenancy - Jarek Potiuk
> > > > >
> > > > > 2) The review of the example architecture that
> > > > > needs the "multitenancy" - this is from the Google Composer team -
> > > > > Mateusz Henc
> > > > >
> > > > > 3) Maybe others would like to get their case explained similarly
> > > > >
> > > > > 4) Discuss proposals on the scope of the AIP(s) we want to write
> > > > > and the rough approach we can take for implementation and who will do
> > > > > what
> > > > >
> > > > > Google Meet call: meet.google.com/rxu-tvdz-vpv (edited)
> > > > >
> > > > > We will send more info/slides then. Anyone who would like to show/add
> > > > > something, please respond here :).
> > > > >
> > > > > J.
