amoghrajesh commented on code in PR #68133:
URL: https://github.com/apache/airflow/pull/68133#discussion_r3368872148
##########
airflow-core/src/airflow/api_fastapi/core_api/datamodels/asset_store.py:
##########
@@ -67,6 +66,10 @@ def value_is_json_representable(cls, v: JsonValue) -> JsonValue:
             serialized = json.dumps(v, allow_nan=False)
         except ValueError:
             raise ValueError("value contains non-finite numbers; NaN and Inf are not JSON representable")
-        if len(serialized) > _MAX_SERIALIZED_BYTES:
-            raise ValueError(f"value exceeds maximum serialized size of {_MAX_SERIALIZED_BYTES} bytes")
+        limit = conf.getint("state_store", "max_value_storage_bytes")
+        if limit > 0 and len(serialized) > limit:
+            raise ValueError(
+                f"value exceeds max_value_storage_bytes ({limit}); "
+                "for large payloads configure a custom [state_store] backend"
+            )
Review Comment:
XCom does not enforce a size limit in the model, and the ObjectStorage XCom
backend has `xcom_objectstorage_threshold`, which tiers large values to object
storage rather than rejecting them.
For task state, we have a clearer use case than XCom had at design time:
coordination state (job IDs, cursors, retry counts), not data payloads. A
configurable default cap lets us nudge operators toward a custom backend
before they accidentally flood the DB with large values, a lesson XCom only
learned retroactively. The `65535` default can be raised as needed via
`[state_store] max_value_storage_bytes`.
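
For reference, the cap described above can be sketched standalone as follows.
This is only an illustration, not the PR's code: `check_value_size` and
`DEFAULT_MAX_VALUE_STORAGE_BYTES` are hypothetical names, and the limit is
passed as an argument here instead of being read through `conf.getint`.

```python
import json

# Assumed default, taken from the `65535` mentioned in the comment above.
DEFAULT_MAX_VALUE_STORAGE_BYTES = 65535


def check_value_size(value, limit=DEFAULT_MAX_VALUE_STORAGE_BYTES):
    """Return the JSON serialization of `value`, or raise ValueError.

    Hypothetical helper mirroring the two checks in the diff: reject
    non-finite numbers, then reject values over the configured cap.
    """
    try:
        serialized = json.dumps(value, allow_nan=False)
    except ValueError:
        raise ValueError(
            "value contains non-finite numbers; NaN and Inf are not JSON representable"
        ) from None
    # json.dumps escapes non-ASCII characters by default (ensure_ascii=True),
    # so the character count here equals the UTF-8 byte count.
    # A limit of 0 or less disables the cap, as in `limit > 0 and ...`.
    if limit > 0 and len(serialized) > limit:
        raise ValueError(
            f"value exceeds max_value_storage_bytes ({limit}); "
            "for large payloads configure a custom [state_store] backend"
        )
    return serialized
```

Small coordination state such as `{"job_id": "abc", "retries": 3}` passes
unchanged, a 70000-character string is rejected under the default cap, and
the same string is accepted when the limit is set to `0`.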