uranusjr commented on code in PR #45562:
URL: https://github.com/apache/airflow/pull/45562#discussion_r1917646149
##########
airflow/serialization/serialized_objects.py:
##########
@@ -297,6 +306,25 @@ def decode_asset_condition(var: dict[str, Any]) -> BaseAsset:
         raise ValueError(f"deserialization not implemented for DAT {dat!r}")


+def decode_asset(var: dict[str, Any]) -> BaseAsset:
+    """
+    Decode a previously serialized asset.
+
+    :meta private:
+    """
+
+    def _decode_trigger(trigger_infos: dict[str, Any]) -> BaseTrigger:
+        return import_string(trigger_infos["classpath"])(**trigger_infos["kwargs"])
Review Comment:
We should check that we are importing from a known location. This is how
timetables are decoded: the class must be defined under the `airflow`
namespace, or come from a plugin-registered location.
be7efb1d30929a7f742f5b7735a3d6fbadadd352
With that said, I don’t think we actually need the full trigger objects here
in the first place? The decoded trigger here simply gets serialised again to be
stored in the database anyway.
https://github.com/apache/airflow/blob/93441c9adb3ca7be86d9dd8943a968ed93cc015f/airflow/dag_processing/collection.py#L748-L808
It’s likely better to simply store a serialised trigger representation here
instead, similar to how MappedOperator gets deserialised. MappedOperator does
not import the underlying operator class when deserialised; it only stores the
needed values in a dict, and only fully hydrates the class on the worker.
https://github.com/apache/airflow/blob/93441c9adb3ca7be86d9dd8943a968ed93cc015f/airflow/models/mappedoperator.py#L257-L262
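As a rough sketch of that lazy approach (class and function names here are illustrative, not Airflow's actual MappedOperator machinery), the decoder would keep the serialised dict as plain data and defer the import to wherever the real trigger is needed:

```python
# Illustrative sketch of the suggested lazy deserialization: carry the
# serialised trigger as plain data, and only import/instantiate the real
# class where it is actually needed (e.g. on the worker).
from dataclasses import dataclass
from importlib import import_module
from typing import Any


@dataclass
class SerializedTrigger:
    """Holds the serialised form; no import happens at deserialization time."""

    classpath: str
    kwargs: dict[str, Any]

    def hydrate(self):
        """Import and instantiate the real trigger class on demand."""
        module_path, _, name = self.classpath.rpartition(".")
        cls = getattr(import_module(module_path), name)
        return cls(**self.kwargs)


def decode_trigger(trigger_infos: dict[str, Any]) -> SerializedTrigger:
    # Instead of import_string(classpath)(**kwargs), just keep the values.
    return SerializedTrigger(trigger_infos["classpath"], trigger_infos["kwargs"])
```

Since the dag processor only re-serialises the trigger for database storage, it never needs to call `hydrate()` at all; that also sidesteps the unknown-import problem in that code path entirely.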
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]