Hello, lately in my organisation we have started to experience issues because we have too many Airflow environments to manage, which pushes us towards using Airflow as a multi-tenant environment.
I have seen the same issue in other organisations as well, where one team had to deploy, monitor and upgrade dozens of Airflow instances, which causes a lot of issues and complexity. After some thought, I came to the idea of supporting multi-tenant Airflow clusters. I know that this is currently not supported and not recommended; however, in my opinion and from what I have seen, it would benefit many Airflow clusters and improve usability and ease of maintenance.

We, in our organisation, have a couple of possible propositions to allow Airflow cluster multi-tenancy:

*1) In the airflow chart & code:*
- Define the pools in the chart instead of the DB.
- In the chart, add a new YAML array to define tenants, each of which consists of a list of pools.
- For each tenant, deploy a scheduler and a triggerer (if needed).
- Each deployed component only processes the pools of its tenant.
- Each connection or variable is changed to be accessible only by a specific tenant (taken from the owner of the DAG or determined in any other way).

*2) In the code only:*
- Create a tenant table in the database.
- Create the relation between tenant and pool.
- Make the connections and variables accessed through the tenant table, thus achieving isolation.
- For each tenant, create at least #schedulers / 2 instances of the scheduler and triggerer job (on the same pod).
- Change the code of the scheduler and triggerer so that every job only queries the pools of the related tenants.

These 2 propositions would solve most of the issues we have, such as starvation and noisy neighbours. Keep in mind that these 2 are very rough drafts and not the full spec of the idea, as I want to first keep this mail as a discussion rather than a proposal, in hopes that it will help me understand whether an AIP can be opened based on this thread.
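
To make the pool-based isolation idea a bit more concrete, here is a minimal sketch in plain Python. It does not use any Airflow API; the names `Tenant`, `QueuedTask`, `validate_disjoint` and `tasks_for_tenant`, as well as the tenant and pool names, are all hypothetical and only illustrate how a per-tenant scheduler could restrict itself to its own pools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tenant:
    """Hypothetical tenant: a name plus the set of pools it owns."""
    name: str
    pools: frozenset


@dataclass(frozen=True)
class QueuedTask:
    """Hypothetical stand-in for a task instance waiting to be scheduled."""
    task_id: str
    pool: str


def validate_disjoint(tenants):
    """Raise ValueError if a pool is assigned to more than one tenant."""
    owner_by_pool = {}
    for tenant in tenants:
        for pool in tenant.pools:
            if pool in owner_by_pool:
                raise ValueError(
                    f"pool {pool!r} is shared by {owner_by_pool[pool]!r} "
                    f"and {tenant.name!r}"
                )
            owner_by_pool[pool] = tenant.name
    return owner_by_pool


def tasks_for_tenant(tenant, queued):
    """Return only the queued tasks whose pool belongs to this tenant."""
    return [task for task in queued if task.pool in tenant.pools]


# Assumed example data, mirroring the "array of tenants" idea from the chart.
TEAM_A = Tenant("team_a", frozenset({"team_a_default", "team_a_heavy"}))
TEAM_B = Tenant("team_b", frozenset({"team_b_default"}))

QUEUE = [
    QueuedTask("extract", "team_a_default"),
    QueuedTask("train", "team_a_heavy"),
    QueuedTask("report", "team_b_default"),
]

if __name__ == "__main__":
    validate_disjoint([TEAM_A, TEAM_B])
    for tenant in (TEAM_A, TEAM_B):
        ids = [task.task_id for task in tasks_for_tenant(tenant, QUEUE)]
        print(tenant.name, ids)
```

Requiring the pools of different tenants to be disjoint is an assumption of this sketch; it is what would let each tenant's scheduler ignore every other tenant's tasks, which is the part aimed at starvation and noisy neighbours.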
I would love to hear what the Airflow community thinks about the topic, as well as any propositions or ideas on what can or should be done, and whether this may solve issues that the community is experiencing.
