jykae commented on code in PR #68074: URL: https://github.com/apache/airflow/pull/68074#discussion_r3387219598
########## chart/files/db_migrate.py: ########## @@ -0,0 +1,326 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +"""Bidirectional Airflow metadata DB reconciliation for the helm chart. + +Decides at runtime whether the helm release wants a forward migrate, a +downgrade, or a no-op, and runs the right command: + +* target == current -> no-op (idempotent check) +* target > current -> ``airflow db migrate`` inside this job's container + (uses the TARGET image, which ships forward scripts). +* target < current -> ``airflow db downgrade --to-version <target>`` + executed inside the still-running api-server pod (the OLD image still + ships the reverse scripts), followed by scaling every DB-touching + workload (api-server, scheduler, triggerer, dag-processor, worker) to + zero so that no OLD pod keeps talking to the now-downgraded schema. Helm + then patches those workloads back to ``replicas: N`` with the TARGET + image as the upgrade proceeds, so the cluster comes back up cleanly on + the target version. This means a downgrade trades the otherwise-broken + rolling-update window for a brief outage (which is unavoidable when the + schema goes backwards). + +Required env: + +* ``AIRFLOW_TARGET_VERSION`` - the version the chart is being upgraded/installed to. +* ``POD_NAMESPACE`` - release namespace, injected via downward API. +* ``RELEASE_NAME`` - the helm release name, used to scope the scale-down to + only the workloads owned by this release. + +Optional env: + +* ``MIGRATE_JOB_DRAIN_TIMEOUT_SECONDS`` - how long to wait for DB-touching + pods to terminate after scale-to-zero on a downgrade. Defaults to 300. + Increase when long-running worker tasks need more time to wind down. +""" + +from __future__ import annotations + +import os +import subprocess +import sys +import time + +from alembic.migration import MigrationContext +from kubernetes import client, config as k8s_config +from kubernetes.stream import stream +from packaging.version import Version Review Comment: Done. The `kubernetes` import is now lazy: it's wrapped in a module-level `try/except ImportError` guard, and only the downgrade branch calls `_require_kubernetes_client()`, which raises a clear `SystemExit` if the client is missing. The forward/no-op/fresh paths no longer require the `kubernetes` package. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
