Nataneljpwd commented on code in PR #61777:
URL: https://github.com/apache/airflow/pull/61777#discussion_r2802926706
##########
providers/common/io/src/airflow/providers/common/io/xcom/backend.py:
##########
@@ -117,9 +117,13 @@ def serialize_value( # type: ignore[override]
run_id: str | None = None,
map_index: int | None = None,
) -> bytes | str:
- # We will use this serialized value to write to the object store.
- s_val = json.dumps(value, cls=XComEncoder)
- s_val_encoded = s_val.encode("utf-8")
+ if isinstance(value, bytes):
+ # Store raw bytes as-is
+ s_val_encoded = value
Review Comment:
I just think this is a very specific use case. If you are talking about video or audio files, they are quite large and end up in S3 anyway (hopefully, if the custom XCom backend is configured), so why not do manually exactly what the backend does? In my view the better way to handle raw bytes is to write them to S3 yourself rather than pushing them as an XCom, and to save the S3 object key as the XCom value (just like the backend does). A rough sketch of what I mean is below.
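To make that concrete, here is a minimal sketch, purely illustrative: the bucket name, key layout, connection id, and the `render_video` helper are placeholders, and it assumes the Amazon provider is installed.
```python
from airflow.decorators import task
from airflow.providers.amazon.aws.hooks.s3 import S3Hook


@task
def produce_video(**context) -> str:
    # Hypothetical helper that produces the large binary payload.
    raw: bytes = render_video()

    # Upload the payload to S3 yourself, same as the object-storage backend would.
    key = f"videos/{context['run_id']}/output.mp4"
    S3Hook(aws_conn_id="aws_default").load_bytes(
        raw, key=key, bucket_name="my-artifacts-bucket", replace=True
    )

    # The XCom value is just the small string key, not the payload itself.
    return key
```
Downstream tasks then pull the key via XCom and read the object from S3 when they actually need it.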
Maybe it is a good idea to implement such a feature, but a discussion could clarify things a little. I cannot speak for the majority; I think it could be a very nice feature, but I would first suggest asking the community and holding a discussion about this topic.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]