Hi all,

On this Monday's Airflow 2.0 Dev call, we briefly discussed how pickling is
currently used in the Airflow codebase and whether we should remove it for
2.0.

Currently, AFAIK only *CeleryExecutor* supports pickling (code
<https://github.com/apache/airflow/blob/master/airflow/executors/executor_loader.py#L122-L126>).
We also have the *--do-pickle* flag on the *airflow scheduler
<https://airflow.readthedocs.io/en/latest/cli-ref.html#scheduler>* CLI
command and the *--ship-dag* flag on the *airflow tasks run
<https://airflow.readthedocs.io/en/latest/cli-ref.html#run>* command.

If we want to remove pickling, I think Airflow 2.0 is the right time.

We have also deprecated the use of pickling in XComs.

The pickle documentation (https://docs.python.org/3/library/pickle.html)
covers the security implications of pickle and compares it with JSON.
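
To make the security concern concrete, here is a minimal sketch (not taken from the Airflow codebase) of why loading an untrusted pickle is risky. The `Payload` class is a made-up name for illustration, and `eval` on a harmless expression stands in for whatever callable an attacker might pick:

```python
import json
import pickle


class Payload:
    """Hypothetical class: stands in for attacker-controlled pickle data."""

    def __reduce__(self):
        # __reduce__ tells pickle which callable to invoke at load time.
        # eval("1 + 1") is harmless; an attacker could name any callable.
        return (eval, ("1 + 1",))


blob = pickle.dumps(Payload())

# Loading the bytes runs eval("1 + 1") and returns its result,
# not a Payload instance.
result = pickle.loads(blob)

# JSON parsing only ever produces plain data (dict, list, str, numbers, ...).
data = json.loads('{"dag_id": "example"}')
```

Here `result` is the value returned by the callable named in `__reduce__`, which is why the pickle docs warn against unpickling data from untrusted sources.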

Another alternative is using *cloudpickle
<https://github.com/cloudpipe/cloudpickle>* (used by PySpark) instead
of *pickle*. It suffers from the same security issues as *pickle* but can
serialize more kinds of objects, such as lambdas and interactively defined
functions.
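
As a rough illustration of the feature gap, the sketch below shows the stdlib pickle refusing a lambda, and then tries cloudpickle only if it happens to be installed (it is a third-party package, so that part is an assumption about the environment). The name `double` is just an example:

```python
import pickle

double = lambda x: x * 2

# The stdlib pickler stores functions by qualified name, and a lambda
# has no importable name, so this fails.
try:
    pickle.dumps(double)
    pickled_ok = True
except (pickle.PicklingError, AttributeError):
    pickled_ok = False

# cloudpickle is optional here; skip this part if it is not installed.
try:
    import cloudpickle
except ImportError:
    cloudpickle = None

if cloudpickle is not None:
    # cloudpickle serializes the function body by value; the resulting
    # bytes are loaded with the regular pickle.loads.
    restored = pickle.loads(cloudpickle.dumps(double))
    assert restored(21) == 42
```

Note that the cloudpickle output is still loaded with `pickle.loads`, so the security caveats above apply equally.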

What do you all think?

Regards,
Kaxil
