Asquator commented on code in PR #53492:
URL: https://github.com/apache/airflow/pull/53492#discussion_r2252151781


##########
airflow-core/src/airflow/migrations/versions/0080_3_1_0_add_ti_max_active_tis_per_dagrun.py:
##########
@@ -0,0 +1,57 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+Add callback_state to deadline.
+
+Revision ID: 2f49f2dae90c
+Revises: f56f68b9e02f
+Create Date: 2025-07-28 16:39:01.181132
+"""
+
+from __future__ import annotations
+
+import sqlalchemy as sa
+from alembic import op
+
+# revision identifiers, used by Alembic.
+revision = "2f49f2dae90c"
+down_revision = "f56f68b9e02f"
+branch_labels = None
+depends_on = None
+airflow_version = "3.1.0"
+
+
+def upgrade():
+    """Add callback_state to deadline."""
+    with op.batch_alter_table("task_instance", schema=None) as batch_op:
+        batch_op.add_column(sa.Column("max_active_tis_per_dag", sa.Integer, nullable=True))

Review Comment:
   `max_active_tis_per_dag` and `max_active_tis_per_dagrun` are per-task 
parameters. They can't be stored in the `dag_run` table because different tasks 
may have different values for these fields. The overhead is clear, and it's 
even more evident with mapped tasks, where you can have thousands of task 
instances duplicating the same values. I agree it's bad, and I'd like to hear 
from Airflow experts who specialize in DB performance. A separate discussion 
is whether to normalize the `task_instance` table and maybe derive a `task` 
table from it. That is a totally different change that deserves an AIP of its 
own, because it means tasks become standalone objects. I think I can see the 
good and the very good sides of it, but it's a completely different topic.
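
To make the trade-off concrete, here is a minimal sketch using the standard-library `sqlite3` module. The table names `ti_denormalized`, `ti_normalized`, and `task` are hypothetical and heavily simplified; they are not Airflow's real schema, and the limit values 4 and 2 are arbitrary. It only illustrates how many copies of the per-task limits each layout stores for one mapped task with 1000 instances.

```python
import sqlite3

# Hypothetical, simplified tables for illustration only (not Airflow's schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Layout in this PR: limits are copied onto every task instance row.
    CREATE TABLE ti_denormalized (
        dag_id TEXT, task_id TEXT, run_id TEXT, map_index INTEGER,
        max_active_tis_per_dag INTEGER,
        max_active_tis_per_dagrun INTEGER
    );
    -- Possible normalized layout: limits live once per task.
    CREATE TABLE task (
        dag_id TEXT, task_id TEXT,
        max_active_tis_per_dag INTEGER,
        max_active_tis_per_dagrun INTEGER,
        PRIMARY KEY (dag_id, task_id)
    );
    CREATE TABLE ti_normalized (
        dag_id TEXT, task_id TEXT, run_id TEXT, map_index INTEGER,
        FOREIGN KEY (dag_id, task_id) REFERENCES task (dag_id, task_id)
    );
    """
)

MAPPED_INSTANCES = 1000  # one mapped task expanded into 1000 instances
rows = [("d", "t", "r1", i) for i in range(MAPPED_INSTANCES)]

# Denormalized: the same two values are written on every row.
conn.executemany("INSERT INTO ti_denormalized VALUES (?, ?, ?, ?, 4, 2)", rows)

# Normalized: the values are written once, and instances reference the task.
conn.execute("INSERT INTO task VALUES ('d', 't', 4, 2)")
conn.executemany("INSERT INTO ti_normalized VALUES (?, ?, ?, ?)", rows)

denorm_copies = conn.execute(
    "SELECT COUNT(max_active_tis_per_dag) FROM ti_denormalized"
).fetchone()[0]
norm_copies = conn.execute(
    "SELECT COUNT(max_active_tis_per_dag) FROM task"
).fetchone()[0]

# The normalized layout needs a join to read a limit for a task instance.
limit_via_join = conn.execute(
    """
    SELECT DISTINCT t.max_active_tis_per_dagrun
    FROM ti_normalized AS ti
    JOIN task AS t USING (dag_id, task_id)
    """
).fetchall()

print(denorm_copies, norm_copies, limit_via_join)
```

The denormalized layout avoids the join at read time at the cost of one copy per task instance; whether that cost matters in practice is exactly the DB performance question raised above.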



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
