Copilot commented on code in PR #67572:
URL: https://github.com/apache/airflow/pull/67572#discussion_r3306574975
##########
airflow-core/src/airflow/dag_processing/manager.py:
##########
@@ -1566,6 +1571,16 @@ def emit_metrics(*, parse_time: float, dag_file_stats:
Sequence[DagFileStat]):
stats.gauge("dag_processing.total_parse_time", parse_time)
stats.gauge("dagbag_size", sum(stat.num_dags for stat in dag_file_stats))
stats.gauge("dag_processing.import_errors", sum(stat.import_errors for
stat in dag_file_stats))
+ # COUNT(*) on the serialized_dag table adds one DB round-trip per parse
loop.
+ # This can be expensive on large deployments (query plan is DB-dependent
and
+ # may involve a full table scan). The call is isolated so that a transient
+ # DB error never kills the parse loop; throttling this metric in the future
+ # is a straightforward follow-up if the round-trip becomes a bottleneck.
+ try:
+ with create_session() as session:
+ stats.gauge("serialized_dag.count",
SerializedDagModel.get_count(session=session))
+ except OperationalError:
+ log.exception("Failed to emit serialized_dag.count metric")
Review Comment:
The comment/docstring intent is to prevent DB errors from killing the parse
loop, but this currently only swallows `OperationalError`. Other SQLAlchemy DB
failures (e.g. disconnects, timeouts mapped differently, or broader
`SQLAlchemyError` subclasses) can still bubble up and terminate the loop.
Consider catching `sqlalchemy.exc.SQLAlchemyError` instead (and importing
that), or update the comment/tests to explicitly state only `OperationalError`
is swallowed.
##########
airflow-core/src/airflow/dag_processing/manager.py:
##########
@@ -41,6 +41,7 @@
import attrs
import structlog
from sqlalchemy import select, update
+from sqlalchemy.exc import OperationalError
Review Comment:
The comment/docstring intent is to prevent DB errors from killing the parse
loop, but this currently only swallows `OperationalError`. Other SQLAlchemy DB
failures (e.g. disconnects, timeouts mapped differently, or broader
`SQLAlchemyError` subclasses) can still bubble up and terminate the loop.
Consider catching `sqlalchemy.exc.SQLAlchemyError` instead (and importing
that), or update the comment/tests to explicitly state only `OperationalError`
is swallowed.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]