nicnguyen3103 opened a new issue #20394:
URL: https://github.com/apache/airflow/issues/20394


   ### Apache Airflow version
   
   2.2.2 (latest released)
   
   ### What happened
   
   I tried to upgrade to Airflow 2.2.2, and the Celery worker dies immediately 
on startup with the command `airflow celery worker`. Celery is at 5.1.2; I did 
not encounter this issue on Airflow 2.1.4 with Celery 4.4.5.
   
   The full log can be found in this file 
   
[airflow_celery_error_log.txt](https://github.com/apache/airflow/files/7738675/airflow_celery_error_log.txt)
   
   Both MySQL and Redis are hosted on the internal network of a Docker Swarm. 
Since the documentation does not mention any security setting for Celery 
message deserialization, could you help me look into this issue? Many thanks.
   
   ### What you expected to happen
   
   Celery/Kombu refuses to accept pickle content. Celery's default `accept_content` 
is JSON 
(https://docs.celeryproject.org/en/stable/userguide/configuration.html#std-setting-accept_content). 
I tried to change it to pickle via the environment variable 
`CELERY_ACCEPT_CONTENT=pickle` in Docker Compose, but `airflow celery worker` 
does not pick it up:
   ```
   [tasks]
     . airflow.executors.celery_executor.execute_command
   
   [2021-12-18 03:31:55,136: INFO/MainProcess] Connected to redis://redis:6379/0
   [2021-12-18 03:31:55,145: INFO/MainProcess] mingle: searching for neighbors
   [2021-12-18 03:31:56,163: INFO/MainProcess] mingle: all alone
   [2021-12-18 03:31:56,177: INFO/MainProcess] celery@stg-worker1 ready.
   [2021-12-18 03:31:56,179: CRITICAL/MainProcess] Can't decode message body: 
ContentDisallowed('Refusing to deserialize untrusted content of type pickle 
(application/x-python-serialize)') [type:'application/x-python-serialize' 
encoding:'binary' headers:{'sentry-trace': 
'99d3c3760460417ebdd83936d93084c4-856283284c7dc7c9-', 'headers': 
{'sentry-trace': '99d3c3760460417ebdd83936d93084c4-856283284c7dc7c9-'}}]
   
   body: 
b'\x80\x02}q\x00(X\x04\x00\x00\x00taskq\x01X\x16\x00\x00\x00sentry.tasks.send_pingq\x02X\x02\x00\x00\x00idq\x03X$\x00\x00\x00655800b5-421a-4715-87cb-14d46b9b7013q\x04X\x04\x00\x00\x00argsq\x05)X\x06\x00\x00\x00kwargsq\x06}q\x07X\x05\x00\x00\x00groupq\x08NX\x0b\x00\x00\x00group_indexq\tNX\x07\x00\x00\x00retriesq\nK\x00X\x03\x00\x00\x00etaq\x0bNX\x07\x00\x00\x00expiresq\x0cX
 
\x00\x00\x002021-12-18T03:32:44.624948+00:00q\rX\x03\x00\x00\x00utcq\x0e\x88X\t\x00\x00\x00callbacksq\x0fNX\x08\x00\x00\x00errbacksq\x10NX\t\x00\x00\x00timelimitq\x11NN\x86q\x12X\x07\x00\x00\x00tasksetq\x13NX\x05\x00\x00\x00chordq\x14Nu.'
 (333b)
   Traceback (most recent call last):
     File 
"/home/nonroot/airflow/airflowvirtual/lib/python3.8/site-packages/celery/worker/consumer/consumer.py",
 line 568, in on_task_received
       type_ = message.headers['task']  # protocol v2
   KeyError: 'task'
   
   During handling of the above exception, another exception occurred:
   
   Traceback (most recent call last):
     File 
"/home/nonroot/airflow/airflowvirtual/lib/python3.8/site-packages/celery/worker/consumer/consumer.py",
 line 573, in on_task_received
       payload = message.decode()
     File 
"/home/nonroot/airflow/airflowvirtual/lib/python3.8/site-packages/kombu/message.py",
 line 194, in decode
       self._decoded_cache = self._decode()
     File 
"/home/nonroot/airflow/airflowvirtual/lib/python3.8/site-packages/kombu/message.py",
 line 198, in _decode
       return loads(self.body, self.content_type,
     File 
"/home/nonroot/airflow/airflowvirtual/lib/python3.8/site-packages/kombu/serialization.py",
 line 242, in loads
       raise self._for_untrusted_content(content_type, 'untrusted')
   kombu.exceptions.ContentDisallowed: Refusing to deserialize untrusted 
content of type pickle (application/x-python-serialize)
   [2021-12-18 03:31:56,181: CRITICAL/MainProcess] Can't decode message body: 
ContentDisallowed('Refusing to deserialize untrusted content of type pickle 
(application/x-python-serialize)') [type:'application/x-python-serialize' 
encoding:'binary' headers:{'sentry-trace': 
'df0cfb36633b4d34b0ebf2a293f13e25-9b0212d02c6be824-', 'headers': 
{'sentry-trace': 'df0cfb36633b4d34b0ebf2a293f13e25-9b0212d02c6be824-'}}]
   
   body: 
b'\x80\x02}q\x00(X\x04\x00\x00\x00taskq\x01X#\x00\x00\x00sentry.tasks.enqueue_scheduled_jobsq\x02X\x02\x00\x00\x00idq\x03X$\x00\x00\x007d3c5bc2-06bd-42d5-aa3d-ea36f44b2360q\x04X\x04\x00\x00\x00argsq\x05)X\x06\x00\x00\x00kwargsq\x06}q\x07X\x05\x00\x00\x00groupq\x08NX\x0b\x00\x00\x00group_indexq\tNX\x07\x00\x00\x00retriesq\nK\x00X\x03\x00\x00\x00etaq\x0bNX\x07\x00\x00\x00expiresq\x0cX
 
\x00\x00\x002021-12-18T03:32:44.670615+00:00q\rX\x03\x00\x00\x00utcq\x0e\x88X\t\x00\x00\x00callbacksq\x0fNX\x08\x00\x00\x00errbacksq\x10NX\t\x00\x00\x00timelimitq\x11NN\x86q\x12X\x07\x00\x00\x00tasksetq\x13NX\x05\x00\x00\x00chordq\x14Nu.'
 (346b)
   Traceback (most recent call last):
     File 
"/home/nonroot/airflow/airflowvirtual/lib/python3.8/site-packages/celery/worker/consumer/consumer.py",
 line 568, in on_task_received
       type_ = message.headers['task']  # protocol v2
   KeyError: 'task'
   ```
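   If allowing pickle is the goal, the Airflow worker does not appear to read plain Celery environment variables; its Celery app is built from the dict referenced by the `[celery] celery_config_options` setting. A minimal sketch of the override I would expect to need (the module name `my_celery_config` is hypothetical; `celery_config_options` and `DEFAULT_CELERY_CONFIG` exist in Airflow 2.2, but please verify against the installed version):

   ```python
   # my_celery_config.py -- hypothetical module name.
   # Point Airflow at it with, e.g.:
   #   AIRFLOW__CELERY__CELERY_CONFIG_OPTIONS=my_celery_config.CELERY_CONFIG
   from airflow.config_templates.default_celery import DEFAULT_CELERY_CONFIG

   # Start from Airflow's defaults and widen accept_content.
   # NOTE: accepting pickle from a broker is a security risk; this only
   # silences the ContentDisallowed error shown in the log above.
   CELERY_CONFIG = {
       **DEFAULT_CELERY_CONFIG,
       'accept_content': ['json', 'pickle'],
   }
   ```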
   
   ### How to reproduce
   
   1. Install Airflow in a non-root container. The command I used in the Dockerfile 
was: 
   `RUN . $AIRFLOW_VENV/bin/activate && pip install 
'apache-airflow[mysql,amazon,celery,papermill]==2.2.2' --constraint 
"https://raw.githubusercontent.com/apache/airflow/constraints-2.2.2/constraints-3.8.txt"`
   2. Install `apache-airflow[statsd]` and `apache-airflow[sentry]`, as there is 
no separate provider package for them.
   3. Set up both Redis and MySQL on the same Docker network. The Celery worker 
connects to Redis as the broker and MySQL as the result backend.
   4. Set up Sentry on the same network and connect via `SENTRY_DSN`.
   5. Start the Celery worker.
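
   The task names in the undecodable bodies (`sentry.tasks.send_ping`, `sentry.tasks.enqueue_scheduled_jobs`) suggest the messages come from the self-hosted Sentry instance sharing the Redis broker, and that Sentry serializes its tasks with pickle protocol 2, which is why the logged bodies start with `b'\x80\x02'`. A small stdlib-only illustration (the message fields here are illustrative, not Sentry's actual payload):

   ```python
   import pickle

   # A minimal stand-in for the kind of Celery message body seen in the
   # log above (task name taken from the log; other fields illustrative).
   message = {'task': 'sentry.tasks.send_ping', 'args': (), 'kwargs': {}}

   # Pickle protocol 2 prefixes the stream with the PROTO opcode \x80
   # followed by the protocol number \x02 -- matching the logged bodies.
   body = pickle.dumps(message, protocol=2)
   assert body[:2] == b'\x80\x02'
   ```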
   
   ### Operating System
   
   Ubuntu "20.04.2 LTS (Focal Fossa)"
   
   ### Versions of Apache Airflow Providers
   
    `pip install 'apache-airflow[mysql,amazon,celery,papermill]==2.2.2' 
--constraint 
"https://raw.githubusercontent.com/apache/airflow/constraints-2.2.2/constraints-3.8.txt"`
   
   ### Deployment
   
   Docker-Compose
   
   ### Deployment details
   
   All servers are running on Docker Swarm with engine version 20.10.7.
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

