SameerMesiah97 commented on code in PR #61777:
URL: https://github.com/apache/airflow/pull/61777#discussion_r2795094967
##########
providers/common/io/src/airflow/providers/common/io/xcom/backend.py:
##########
@@ -171,8 +174,9 @@ def deserialize_value(result) -> Any:
return data
try:
with path.open(mode="rb", compression="infer") as f:
- return json.load(f, cls=XComDecoder)
- except (FileNotFoundError, TypeError, ValueError):
+ data = f.read()
+ return json.loads(data, cls=XComDecoder)
+ except (FileNotFoundError, TypeError, ValueError,
json.decoder.JSONDecodeError):
Review Comment:
What about binaries that are valid inputs for `json.loads`? For example,
`b'"Hello"'`. It seems these will not be stored as bytes. If that is indeed
the case, I would suggest adding a comment like below:
`# Prefer JSON decoding; fall back to raw bytes if it fails.`
##########
providers/common/io/src/airflow/providers/common/io/xcom/backend.py:
##########
@@ -118,8 +118,11 @@ def serialize_value( # type: ignore[override]
map_index: int | None = None,
) -> bytes | str:
# We will use this serialized value to write to the object store.
- s_val = json.dumps(value, cls=XComEncoder)
- s_val_encoded = s_val.encode("utf-8")
+ if not isinstance(value, bytes):
+ s_val = json.dumps(value, cls=XComEncoder)
+ s_val_encoded = s_val.encode("utf-8")
+ else:
+ s_val_encoded = value
Review Comment:
This is more of a stylistic nit that you don’t have to implement but since
raw bytes seems to be a special case, perhaps you could do:
```
if isinstance(value, bytes):
# Store raw bytes as-is
s_val_encoded = value
else:
# Default JSON serialization path
s_val = json.dumps(value, cls=XComEncoder)
s_val_encoded = s_val.encode("utf-8")
```
This is more explicit.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]