We tried zipped DAGs in the past and ran into a bunch of issues. I've
stopped trying to get them to work and just use a Docker operator or
virtualenv operator when I need custom dependencies.
+1 on supporting DAG serialization or remote fetching; that would work
around a bunch of problems.
+1 on the backfill CLI command being a wrapper around submitting a job to
the REST API.
Since backfills run client-side as a CLI command, if something goes wrong
on that node temporarily then the backfill will get killed and never
restart. When a backfill dies overnight and you have to restart
The challenge with using YAML to define the pod spec is that we need to
inject values into the YAML for the pod to work properly.
For example, if you try setting the command property, then the pod will not
actually run the airflow command to start the task. Same idea with needed
environmental
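
To make the injection problem concrete, here is a rough sketch of what
"inject values into the pod spec" could look like once the YAML has been
parsed into a plain dict. This is not how Airflow does it; the function
name, the choice of the first container as the task container, and the
example command and environment variable are all assumptions made for
illustration.

```python
import copy


def inject_required_fields(user_pod_spec, task_command, required_env):
    """Return a copy of a user-supplied pod spec with executor-owned fields forced in.

    Hypothetical helper: `user_pod_spec` is the parsed YAML as a dict,
    `task_command` is the command the executor needs the pod to run, and
    `required_env` maps environment variable names to values.
    """
    spec = copy.deepcopy(user_pod_spec)  # never mutate the user's input
    containers = spec.setdefault("spec", {}).setdefault("containers", [])
    if not containers:
        containers.append({})
    base = containers[0]  # assumption: the first container runs the task

    # A user-set command would stop the task from starting, so override it.
    base["command"] = list(task_command)

    # Merge env entries by name: user entries are kept, required ones win.
    env_by_name = {entry["name"]: entry for entry in base.get("env", [])}
    for name, value in required_env.items():
        env_by_name[name] = {"name": name, "value": value}
    base["env"] = list(env_by_name.values())
    return spec


# Illustrative input: the user set `command`, which would break the task.
user_spec = {
    "spec": {
        "containers": [
            {
                "name": "base",
                "image": "my-image",
                "command": ["sleep", "3600"],
                "env": [{"name": "MY_VAR", "value": "1"}],
            }
        ]
    }
}
final_spec = inject_required_fields(
    user_spec,
    ["airflow", "run", "example_dag", "example_task", "2020-01-01"],  # placeholder
    {"AIRFLOW__CORE__EXECUTOR": "LocalExecutor"},  # placeholder
)
```

The point of the sketch is that anything the user writes in `command`
gets silently replaced, and required environment entries override
user-supplied ones with the same name, so a pure-YAML pod spec still
needs a merge step like this somewhere.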