1fanwang opened a new pull request, #66787:
URL: https://github.com/apache/airflow/pull/66787

   Triggering a Dag with an oversized `conf` payload currently produces a 
generic 500. The DagRun row is created in memory, and the size error surfaces 
only at flush time, deep in SQLAlchemy, as `(1406, "Data too long for column 
'conf' at row 1")` on MySQL, so the caller has no signal that conf size was the 
cause. 
Bug #14159 from 2021 covered the same crash class against the deprecated 
experimental API; the same failure mode is reproducible against the FastAPI 
public API today.
   
   This adds a JSON-size check at the trigger boundary so the request is 
rejected before the row reaches the DB, with a message that points at the right 
fix (XCom / Variables / external storage).
   
   ## Why
   
   Reproducer (against any 3.0+ deployment on MySQL with default 
`innodb_default_row_format`):
   
   ```
   curl -X POST $API/api/v2/dags/example_bash_operator/dagRuns \
     -H "Content-Type: application/json" \
     -d "{\"conf\":{\"k\":\"$(python -c 'print("x"*70000)')\"}}"
   ```
   
   Response: 500. Server log:
   
   ```
   sqlalchemy.exc.DataError: (pymysql.err.DataError)
   (1406, "Data too long for column 'conf' at row 1")
   [SQL: INSERT INTO dag_run (...) VALUES (...)]
   ```
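
To see why this payload trips the column limit, the JSON-encoded size of the conf can be measured locally. This is a rough sketch, assuming a 65535-byte limit and Python's default `json.dumps` formatting; the bytes the server actually stores may differ slightly:

```python
import json

# Same conf body as the curl reproducer: one key holding 70000 "x" characters.
conf = {"k": "x" * 70000}

# Measure the UTF-8 byte length of the JSON text, not the Python string length.
encoded = json.dumps(conf).encode("utf-8")

limit = 65535  # assumed column limit for this illustration
print(f"encoded size: {len(encoded)} bytes, over limit: {len(encoded) > limit}")
```

The JSON wrapper around the value adds only a few bytes, so any value longer than roughly 65 KB is enough to reproduce the error.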
   
   ## What
   
   - New `[core] max_dagrun_conf_size_bytes` (default 65535) bounds the 
JSON-encoded conf size. `0` disables the check.
   - New `airflow.exceptions.DagRunConfTooLargeError` with `status_code = 413` 
carries the measured size and limit.
   - `SerializedDAG.create_dagrun()` validates before insert via the new 
`validate_dagrun_conf_size()` helper, so the CLI and `TriggerDagRunOperator` 
paths get the same check as the REST API.
   - The FastAPI `POST /dags/{dag_id}/dagRuns` handler maps the exception to 
`413 Payload Too Large` with the actionable message.
   
   The default of 65535 fits the smallest MySQL `JSON` column variant; Postgres 
and larger MySQL row formats can raise it.
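
The shape of the check described above can be sketched as follows. This is an illustration only, not the code in this PR: the class body, the function signature, the message text, and the choice to measure UTF-8 bytes of `json.dumps(..., ensure_ascii=False)` are all assumptions made for the example.

```python
import json
from typing import Any, Optional

DEFAULT_MAX_CONF_BYTES = 65535  # default stated in the PR description


class DagRunConfTooLargeError(ValueError):
    """Illustrative stand-in for the exception added to airflow.exceptions."""

    status_code = 413

    def __init__(self, size: int, limit: int) -> None:
        self.size = size
        self.limit = limit
        super().__init__(
            f"DagRun conf is {size} bytes when JSON-encoded; limit is {limit} bytes. "
            "Pass large data via XCom, Variables, or external storage instead."
        )


def validate_dagrun_conf_size(
    conf: Optional[dict[str, Any]], limit: int = DEFAULT_MAX_CONF_BYTES
) -> None:
    """Raise DagRunConfTooLargeError if the encoded conf exceeds the limit."""
    # A limit of 0 disables the check; None or empty conf has nothing to measure.
    if limit == 0 or not conf:
        return
    # Assumption: size is counted in UTF-8 bytes, so multibyte characters
    # count for more than one byte each.
    size = len(json.dumps(conf, ensure_ascii=False).encode("utf-8"))
    if size > limit:
        raise DagRunConfTooLargeError(size, limit)


# Usage: a multibyte payload that is under the limit in characters but over it in bytes.
try:
    validate_dagrun_conf_size({"k": "é" * 40000})
except DagRunConfTooLargeError as err:
    print(err.status_code, err.size, err.limit)
```

Counting bytes rather than characters matters here because the database limit is a byte limit: 40000 two-byte characters fit a 65535-character budget but not a 65535-byte one.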
   
   ## Tests
   
   `airflow-core/tests/unit/models/test_dagrun.py::TestValidateDagRunConfSize` 
covers the helper (None / empty / at-limit / over-limit / disabled / multibyte 
UTF-8). 
`test_dag_run.py::TestTriggerDagRun::test_dagrun_creation_conf_too_large_returns_413` 
covers the route-level mapping to 413.
   
   ## Risk
   
   Backwards-incompatible only for deployments that today rely on MySQL 
rejecting oversized conf at insert time (callers currently see a generic 500 
rather than a clear error). The check is bounded by a config option and can be 
disabled.
   
   Closes #66779
   

