Hello everyone, With the release of Airflow 3 and the refactoring that moved flask-appbuilder (FAB) into a separate provider, as well as recent work by the FAB developers to support SQLAlchemy v2.0 (SQLA2), it is finally possible to work on supporting SQLA2 in Airflow as well.
To get this process going, I added two CI tasks that install SQLA2 (PR #52233) and am now slowly adding various workarounds to overcome repeated test failures. After doing this for a while, there's a couple of discussion points I want to raise: 1. What is the plan regarding dual support of 1.4 & 2.x? If we want to have that, it might lead to pretty messy code (see details below). 2. Should we work on supporting SQLA 2.0 or jump straight to 2.1 (which is currently in beta)? I think we should decide on the refactoring principles and start moving in that direction. With some luck, it might be possible to get this done in time for 3.1. I'd love your inputs on how to best approach this task. More importantly, if anyone wants to participate - please leave your comment on my PR, and let's coordinate. I suggest you use my branch as a starting point, since it has the tests in place. Best, Dev-iL P.S. At the time of writing, the status of running the tests (via `breeze testing core-tests --upgrade-sqlalchemy --maxfail=1000`) is: 134 failed, 6469 passed, 47 skipped, 8 xfailed, 1 warning ------------------- SQLA2 introduced stricter type annotation requirements for ORM mapped attributes. All mapped columns need to use the `Mapped[]` generic type annotation: <pre> class TaskInstance(Base): __tablename__ = "task_instance" # Before (SQLA 1.4) # id = Column(Integer, primary_key=True) # dag_id = Column(String, ForeignKey("dag.id")) # dag_model = relationship("DagModel") # After (SQLA 2.0) id: Mapped[int] = mapped_column(Integer, primary_key=True) dag_id: Mapped[str] = mapped_column(String, ForeignKey("dag.id")) dag_model: Mapped["DagModel"] = relationship() </pre> To address differences of similar nature, and assuming we're interested in dual SQLA version support, I can think of two extremes: 1. We implement two copies of each model, one for each version of SQLA, and use the relevant one based on a runtime check. 2. We use workarounds that disable the breaking features of SQLA2 (e.g. https://github.com/apache/airflow/pull/52233/commits/90d738880553707fa01ee2a6b4acd40609e8cc2f) where possible, and resort to runtime branching where impossible, thus postponing the rewrite to whenever SQLA1.4 support will be dropped. ---------------------- References: 1. Main SQLA2 support issue: https://github.com/apache/airflow/issues/28723 2. My PR adding SQLA2 to the CI: https://github.com/apache/airflow/pull/52233 3. FAB's SQLA2 support PR: https://github.com/dpgaspar/Flask-AppBuilder/pull/2241