github-actions[bot] opened a new pull request, #67129:
URL: https://github.com/apache/airflow/pull/67129
* fix: migrate existing deadline rows in migration 0080 upgrade and downgrade
Both the upgrade and downgrade paths of migration 0080
(808787349f22 - Modify deadline callback schema) added NOT NULL columns
to the deadline table without first populating them from the existing
data, causing:
* upgrade: NotNullViolation when adding callback JSON NOT NULL to a
non-empty table (existing rows have no value for the new
column).
* downgrade: NotNullViolation on PostgreSQL / silent NULL on MySQL when
adding callback VARCHAR(500) NOT NULL after dropping the
JSON column, crashing the subsequent 0094 upgrade with
json.loads(None).
Fix both paths with the same pattern used by migration 0094:
1. Read the existing data before any schema change.
2. Add the new column(s) as nullable so the DDL succeeds on all
supported databases.
3. Back-fill the column using a typed SA table clause (so each dialect
handles JSON serialisation correctly).
4. Enforce NOT NULL only after every row has a valid value.
Upgrade serialises the old (path, kwargs) pair into the
{"__data__": {"path": ..., "kwargs": ...}, "__classname__": ...,
"__version__": 0} format expected by migration 0094.
Downgrade extracts path/kwargs back from that same JSON envelope
before restoring the VARCHAR callback column.
Made-with: Cursor
Co-authored-by: Cursor <[email protected]>
* fix(deadline): repair NULL callback rows in 0094 upgrade
Legacy MySQL deployments that ran the original (pre-#66016) 0080
migration silently wrote NULL into deadline.callback. Alembic is already
stamped past 808787349f22 on those deployments, so the fixed 0080 won't
re-run -- 0094 then crashes on json.loads(None).
Defensive handling in 0094:
- _upgrade_mysql_sqlite: detect raw_cb is None, log warning, default to
empty envelope.
- _upgrade_postgresql: COALESCE on cb_path / cb_kwargs so NULL jsonb
doesn't propagate into callback.data.
Regression test starts from a post-0080 schema with callback=NULL and
verifies 0094 produces an empty envelope without crashing, plus that a
mixed NULL+valid batch migrates both rows correctly.
Addresses ephraimbuddy's review comment on #66016.
* fix(deadline): address review nits
- 0094 PG path: wrap kwargs COALESCE with NULLIF so JSON-literal null
(e.g. from hand-edited DBs) is also normalized to {} -- matches the
MySQL/SQLite defensive normalization.
- 0094 MySQL/SQLite: aggregate per-row NULL-callback WARNING into a
single summary line after the loop to avoid log spam on deployments
with many legacy NULL rows.
- 0080 downgrade: log a WARNING when an existing row's kwargs is not a
dict (instead of silently resetting), mirroring the VARCHAR truncate
warning pattern.
- 0080 tests: add test_upgrade_exact_batch_boundary covering the case
where rows == batch_size to exercise the loop-continuation path.
* test(deadline): insert deadline ids in hex form to avoid SA Uuid type
mismatch
`_upgrade_mysql_sqlite` declares the deadline `id` column as `sa.Uuid()`.
On SQLite the SQLAlchemy write path serializes UUID objects as a hex string
without dashes. The previous test inserted ids via `str(uuid.uuid4())`
(dashed form), so the SELECT correctly parsed them back to UUID objects
but the subsequent UPDATE's WHERE clause produced the hex form -- no rows
matched, the WHERE callback_id IS NULL filter never narrowed, and the
loop spun until the 60s execution timeout.
Insert ids via `uuid.uuid4().hex` so read and write round-trip cleanly.
* fix(deadline): drop redundant empty-rows break in 0080 batch loops
* fix(deadline): restore empty-rows break in 0080 batch loops
Without this guard, an empty deadline table causes executemany with an
empty parameter list, which SQLAlchemy rejects with 'A value is required
for bind parameter'. Reverts the removal from e50f106f.
---------
(cherry picked from commit c8a6c55cac8031c4131b5b0e064c45cb232b9966)
Co-authored-by: Rahul Vats <[email protected]>
Co-authored-by: Cursor <[email protected]>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]