emredjan opened a new issue, #54182: URL: https://github.com/apache/airflow/issues/54182
### Apache Airflow version

3.0.3

### If "Other Airflow 2 version" selected, which one?

_No response_

### What happened?

According to the documentation, the new XCom object storage backend can use the snappy compression algorithm through `python-snappy` if it is installed. However, during an XCom push, it fails with the error:

`ValueError: Compression snappy is not supported. Make sure it is installed.`

The reason is that the XCom object storage backend uses `fsspec.utils.compressions` to check whether a compression is available, in the file [`providers/common/io/xcom/backend.py`](https://github.com/apache/airflow/blob/main/providers/common/io/src/airflow/providers/common/io/xcom/backend.py):

```python
def _get_compression_suffix(compression: str) -> str:
    """
    Return the compression suffix for the given compression.

    :raises ValueError: if the compression is not supported
    """
    for suffix, c in fsspec.utils.compressions.items():
        if c == compression:
            return suffix

    raise ValueError(f"Compression {compression} is not supported. Make sure it is installed.")
```

However, this object is a static mapping of file suffixes to compression names, not a dynamic list, and it will not include `snappy` even if it is installed:

```python
In [1]: fsspec.utils.compressions
Out[1]: {'zip': 'zip', 'bz2': 'bz2', 'gz': 'gzip', 'lzma': 'lzma', 'xz': 'xz'}
```

The correct way is to check `fsspec.available_compressions()`, which returns a dynamically updated list:

```python
In [2]: fsspec.available_compressions()
Out[2]: [None, 'zip', 'bz2', 'gzip', 'lzma', 'xz', 'snappy']
```

### What you think should happen instead?
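As a rough sketch of the check described above (not the actual provider code), availability could be decided from the dynamic list while the static mapping is only used to look up a suffix. The function name `get_compression_suffix`, its extra parameters, and the fallback of using the compression name itself as the suffix are assumptions made for illustration. The sample values are copied from the `fsspec` output shown in this issue so the snippet runs without `fsspec` installed.

```python
from __future__ import annotations

from collections.abc import Iterable, Mapping


def get_compression_suffix(
    compression: str,
    suffix_map: Mapping[str, str],
    available: Iterable[str | None],
) -> str:
    """Return a file suffix for ``compression``.

    Hypothetical helper: availability comes from ``available`` (the dynamic
    list), while ``suffix_map`` (suffix -> compression name) is only used to
    find a conventional suffix.

    :raises ValueError: if the compression is not available
    """
    if compression not in set(available):
        raise ValueError(f"Compression {compression} is not supported. Make sure it is installed.")

    # Prefer a known suffix, e.g. "gz" for "gzip".
    for suffix, name in suffix_map.items():
        if name == compression:
            return suffix

    # Assumption: with no known suffix, fall back to the compression name.
    return compression


# Values copied from the fsspec output quoted in this issue.
STATIC_MAP = {"zip": "zip", "bz2": "bz2", "gz": "gzip", "lzma": "lzma", "xz": "xz"}
AVAILABLE = [None, "zip", "bz2", "gzip", "lzma", "xz", "snappy"]

print(get_compression_suffix("gzip", STATIC_MAP, AVAILABLE))    # gz
print(get_compression_suffix("snappy", STATIC_MAP, AVAILABLE))  # snappy
```

In a real environment the last two arguments would presumably be `fsspec.utils.compressions` and `fsspec.available_compressions()`; whether the provider should fall back to the compression name as the suffix is an open design choice.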
It should use snappy for XCom compression when `python-snappy` is installed and `snappy` is set as `xcom_objectstorage_compression` in the config.

### How to reproduce

- Install `python-snappy` in your environment
- Enter `airflow.providers.common.io.xcom.backend.XComObjectStorageBackend` as `xcom_backend` in the config
- Enter `snappy` as `xcom_objectstorage_compression` in the config
- Run a DAG with an XCom larger than `xcom_objectstorage_threshold`

### Operating System

RHEL 8.10

### Versions of Apache Airflow Providers

```shell
apache-airflow-providers-celery==3.12.1
apache-airflow-providers-common-compat==1.7.2
apache-airflow-providers-common-io==1.6.1
apache-airflow-providers-common-sql==1.27.3
apache-airflow-providers-fab==2.3.0
apache-airflow-providers-ftp==3.13.1
apache-airflow-providers-hashicorp==4.3.1
apache-airflow-providers-http==5.3.2
apache-airflow-providers-imap==3.9.1
apache-airflow-providers-microsoft-azure==12.5.0
apache-airflow-providers-microsoft-mssql==4.3.1
apache-airflow-providers-microsoft-psrp==3.1.1
apache-airflow-providers-mysql==6.3.2
apache-airflow-providers-odbc==4.10.1
apache-airflow-providers-sftp==5.3.2
apache-airflow-providers-smtp==2.1.1
apache-airflow-providers-sqlite==4.1.1
apache-airflow-providers-ssh==4.1.1
apache-airflow-providers-standard==1.4.1
```

### Deployment

Virtualenv installation

### Deployment details

CeleryExecutor

### Anything else?

_No response_

### Are you willing to submit PR?

- [x] Yes I am willing to submit a PR!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
