Hello everyone,

With the release of Airflow 3 and the refactoring that moved
flask-appbuilder (FAB) into a separate provider, as well as recent work by
the FAB developers to support SQLAlchemy v2.0 (SQLA2), it is finally
possible to work on supporting SQLA2 in Airflow as well.

To get this process going, I added two CI tasks that install SQLA2 (PR
#52233) and am now slowly adding various workarounds to overcome repeated
test failures. After doing this for a while, there's a couple of discussion
points I want to raise:

   1. What is the plan regarding dual support of 1.4 & 2.x? If we want to
   have that, it might lead to pretty messy code (see details below).
   2. Should we work on supporting SQLA 2.0 or jump straight to 2.1 (which
   is currently in beta)?

I think we should decide on the refactoring principles and start moving in
that direction. With some luck, it might be possible to get this done in
time for 3.1.

I'd love your inputs on how to best approach this task. More importantly,
if anyone wants to participate - please leave your comment on my PR, and
let's coordinate. I suggest you use my branch as a starting point, since it
has the tests in place.

Best,
Dev-iL

P.S.
At the time of writing, the status of running the tests (via `breeze
testing core-tests --upgrade-sqlalchemy --maxfail=1000`) is: 134 failed,
6469 passed, 47 skipped, 8 xfailed, 1 warning
-------------------

SQLA2 introduced stricter type annotation requirements for ORM mapped
attributes. All mapped columns need to use the `Mapped[]` generic type
annotation:

<pre>
class TaskInstance(Base):
    __tablename__ = "task_instance"

    # Before (SQLA 1.4)
    # id = Column(Integer, primary_key=True)
    # dag_id = Column(String, ForeignKey("dag.id"))
    # dag_model = relationship("DagModel")

    # After (SQLA 2.0)
    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    dag_id: Mapped[str] = mapped_column(String, ForeignKey("dag.id"))
    dag_model: Mapped["DagModel"] = relationship()
</pre>

To address differences of similar nature, and assuming we're interested in
dual SQLA version support, I can think of two extremes:

   1. We implement two copies of each model, one for each version of SQLA,
   and use the relevant one based on a runtime check.
   2. We use workarounds that disable the breaking features of SQLA2 (e.g.
   
https://github.com/apache/airflow/pull/52233/commits/90d738880553707fa01ee2a6b4acd40609e8cc2f)
   where possible, and resort to runtime branching where impossible, thus
   postponing the rewrite to whenever SQLA1.4 support will be dropped.

----------------------

References:

   1. Main SQLA2 support issue:
   https://github.com/apache/airflow/issues/28723
   2. My PR adding SQLA2 to the CI:
   https://github.com/apache/airflow/pull/52233
   3. FAB's SQLA2 support PR:
   https://github.com/dpgaspar/Flask-AppBuilder/pull/2241

Reply via email to