Copilot commented on code in PR #67572:
URL: https://github.com/apache/airflow/pull/67572#discussion_r3306574975


##########
airflow-core/src/airflow/dag_processing/manager.py:
##########
@@ -1566,6 +1571,16 @@ def emit_metrics(*, parse_time: float, dag_file_stats: 
Sequence[DagFileStat]):
     stats.gauge("dag_processing.total_parse_time", parse_time)
     stats.gauge("dagbag_size", sum(stat.num_dags for stat in dag_file_stats))
     stats.gauge("dag_processing.import_errors", sum(stat.import_errors for 
stat in dag_file_stats))
+    # COUNT(*) on the serialized_dag table adds one DB round-trip per parse 
loop.
+    # This can be expensive on large deployments (query plan is DB-dependent 
and
+    # may involve a full table scan).  The call is isolated so that a transient
+    # DB error never kills the parse loop; throttling this metric in the future
+    # is a straightforward follow-up if the round-trip becomes a bottleneck.
+    try:
+        with create_session() as session:
+            stats.gauge("serialized_dag.count", 
SerializedDagModel.get_count(session=session))
+    except OperationalError:
+        log.exception("Failed to emit serialized_dag.count metric")

Review Comment:
   The comment/docstring intent is to prevent DB errors from killing the parse 
loop, but this currently only swallows `OperationalError`. Other SQLAlchemy DB 
failures (e.g. disconnects, timeouts mapped differently, or broader 
`SQLAlchemyError` subclasses) can still bubble up and terminate the loop. 
Consider catching `sqlalchemy.exc.SQLAlchemyError` instead (and importing 
that), or update the comment/tests to explicitly state only `OperationalError` 
is swallowed.



##########
airflow-core/src/airflow/dag_processing/manager.py:
##########
@@ -41,6 +41,7 @@
 import attrs
 import structlog
 from sqlalchemy import select, update
+from sqlalchemy.exc import OperationalError

Review Comment:
   The comment/docstring intent is to prevent DB errors from killing the parse 
loop, but this currently only swallows `OperationalError`. Other SQLAlchemy DB 
failures (e.g. disconnects, timeouts mapped differently, or broader 
`SQLAlchemyError` subclasses) can still bubble up and terminate the loop. 
Consider catching `sqlalchemy.exc.SQLAlchemyError` instead (and importing 
that), or update the comment/tests to explicitly state only `OperationalError` 
is swallowed.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to