Copilot commented on code in PR #60562:
URL: https://github.com/apache/airflow/pull/60562#discussion_r2692951549
##########
task-sdk/src/airflow/sdk/definitions/asset/__init__.py:
##########
@@ -378,7 +379,7 @@ def __eq__(self, other: Any) -> bool:
def __hash__(self):
f = attrs.filters.include(*attrs.fields_dict(Asset))
- return hash(attrs.asdict(self, filter=f))
+ return hash(json.dumps(attrs.asdict(self, filter=f), sort_keys=True))
Review Comment:
The `attrs.asdict()` call will recursively convert all nested attrs objects
(including `AssetWatcher` instances with their `trigger` field) to
dictionaries. However, the `BaseEventTrigger` objects in the `watchers` field
may not be JSON serializable by default, which could cause a `TypeError` when
`json.dumps()` is called. Consider adding `recurse=False` to `attrs.asdict()`
or implementing a custom value serializer that handles non-JSON-serializable
objects, or document that Assets with watchers cannot be hashed.
```suggestion
data = attrs.asdict(self, filter=f)
try:
serialized = json.dumps(data, sort_keys=True)
except TypeError:
# Fallback for non-JSON-serializable values (e.g.
BaseEventTrigger instances).
# This preserves hashing capability instead of raising at
runtime.
serialized = repr(data)
return hash(serialized)
```
##########
task-sdk/tests/task_sdk/definitions/test_asset.py:
##########
@@ -192,6 +192,19 @@ def test_not_equal_when_different_uri():
assert asset1 != asset2
+def test_hash_for_same_uri():
+ asset1 = Asset(uri="s3://example/asset")
+ asset2 = Asset(uri="s3://example/asset")
+
+ assert hash(asset1) == hash(asset2)
Review Comment:
The test cases only verify hash behavior with basic Assets that have default
values for `extra` and `watchers`. Add test cases that verify hash behavior
when Assets have non-empty `extra` dictionaries and when Assets have
identical/different `extra` values to ensure the JSON serialization handles
these fields correctly.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]