zachliu commented on PR #68268: URL: https://github.com/apache/airflow/pull/68268#issuecomment-4671726163
+1 on this! We just had a production incident that shows why blocking the save matters.

A malformed Variable (a single trailing comma in the JSON) got saved through this form because the `Invalid JSON` warning is easy to miss and `Save` stays enabled. The variable drives parse-time task generation in many of our DAGs, so on the next parse those DAGs serialized with zero tasks, and every scheduled run silently `succeeded` in ~0.1s with all task instances marked `removed`. The blast radius was far larger than a single variable would suggest, and root-causing it took a while because nothing actually errored. Guarding `Save` behind JSON validation, as in this PR, would have stopped it at the source.

One tangential note in case it's useful: even after we fixed the variable, tasks defined inside a `TaskGroup` stayed `removed`, while bare top-level tasks recovered on reparse. We only got them back by forcing a new serialized DAG version (a structural change) and clearing the runs. That smells like a possible `TaskGroup` reconciliation bug in Airflow 3.x, separate from this UI fix. Happy to file a dedicated issue if it'd help.
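
A minimal sketch of the failure mode described in the comment, in case it helps reviewers picture it. It is written in Python purely for illustration and is not the PR's implementation or the incident's actual DAG code. The helper names (`is_valid_json`, `task_ids_lenient`), the `tables` key, and the `load_` prefix are hypothetical. It also assumes the parse-time code fell back to an empty list on bad JSON, which is one plausible way to end up with zero tasks and no error.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True only if `text` parses as strict JSON.

    Stand-in for the kind of check a form could run before enabling Save.
    """
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


def task_ids_lenient(raw: str) -> list[str]:
    """Hypothetical parse-time helper that swallows malformed JSON.

    Assumed pattern, not taken from the incident: on a decode error it
    returns an empty list, so a DAG built from it would have zero tasks.
    """
    try:
        config = json.loads(raw)
    except json.JSONDecodeError:
        return []  # silent fallback: nothing errors, nothing is generated
    return [f"load_{name}" for name in config.get("tables", [])]


GOOD = '{"tables": ["orders", "users"]}'
BAD = '{"tables": ["orders", "users"],}'  # single trailing comma

print(is_valid_json(GOOD))     # True
print(is_valid_json(BAD))      # False: strict JSON rejects the trailing comma
print(task_ids_lenient(GOOD))  # ['load_orders', 'load_users']
print(task_ids_lenient(BAD))   # []
```

Under these assumptions, the same `is_valid_json` result that would keep `Save` disabled is the only signal that anything is wrong, since the lenient helper returns an empty list without raising.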
