[ https://issues.apache.org/jira/browse/AIRFLOW-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17012711#comment-17012711 ]

ASF GitHub Bot commented on AIRFLOW-6529:
-----------------------------------------

sarutak commented on pull request #7128: [AIRFLOW-6529] Serialization error occurs when the scheduler tries to run on macOS.
URL: https://github.com/apache/airflow/pull/7128
 
 
   When we try to run the scheduler on macOS, we get a serialization error like the following.
   ```
     ____________       _____________
    ____    |__( )_________  __/__  /________      __
   ____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
   ___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
    _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
   [2020-01-10 19:54:41,974] {executor_loader.py:59} INFO - Using executor SequentialExecutor
   [2020-01-10 19:54:41,983] {scheduler_job.py:1462} INFO - Starting the scheduler
   [2020-01-10 19:54:41,984] {scheduler_job.py:1469} INFO - Processing each file at most -1 times
   [2020-01-10 19:54:41,984] {scheduler_job.py:1472} INFO - Searching for files in /Users/sarutak/airflow/dags
   [2020-01-10 19:54:42,025] {scheduler_job.py:1474} INFO - There are 27 files in /Users/sarutak/airflow/dags
   [2020-01-10 19:54:42,025] {scheduler_job.py:1527} INFO - Resetting orphaned tasks for active dag runs
   [2020-01-10 19:54:42,059] {scheduler_job.py:1500} ERROR - Exception when executing execute_helper
   Traceback (most recent call last):
     File "/Users/sarutak/work/oss/airflow-env/master-python3.8.1/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1498, in _execute
       self._execute_helper()
     File "/Users/sarutak/work/oss/airflow-env/master-python3.8.1/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1531, in _execute_helper
       self.processor_agent.start()
     File "/Users/sarutak/work/oss/airflow-env/master-python3.8.1/lib/python3.8/site-packages/airflow/utils/dag_processing.py", line 348, in start
       self._process.start()
     File "/opt/python/3.8.1/lib/python3.8/multiprocessing/process.py", line 121, in start
       self._popen = self._Popen(self)
     File "/opt/python/3.8.1/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
       return _default_context.get_context().Process._Popen(process_obj)
     File "/opt/python/3.8.1/lib/python3.8/multiprocessing/context.py", line 283, in _Popen
       return Popen(process_obj)
     File "/opt/python/3.8.1/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
       super().__init__(process_obj)
     File "/opt/python/3.8.1/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
       self._launch(process_obj)
     File "/opt/python/3.8.1/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
       reduction.dump(process_obj, fp)
     File "/opt/python/3.8.1/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
       ForkingPickler(file, protocol).dump(obj)
   AttributeError: Can't pickle local object 'SchedulerJob._execute.<locals>.processor_factory'
   ```
   
   The reason is that the scheduler tries to run its subprocesses via multiprocessing in spawn mode, and as of Python 3.8, spawn is the default start method on macOS. Spawn requires the child's target to be pickled, but `processor_factory` is a function local to `SchedulerJob._execute`, which pickle cannot serialize.
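   
   To make the cause concrete, here is a minimal reproduction sketch, independent of Airflow; the nested `processor_factory` below only mimics `SchedulerJob._execute.<locals>.processor_factory` and is not the real scheduler code. Under the "spawn" start method the child process is a fresh interpreter, so the `target` callable must be pickled, and pickle cannot reference a function defined inside another function.
   ```python
   import multiprocessing
   
   def start_agent():
       # A nested function only exists inside its enclosing call, so pickle
       # cannot refer to it by a qualified module-level name.
       def processor_factory():
           print("processing DAG files")
   
       # "spawn" has been the default start method on macOS since Python 3.8;
       # requesting it explicitly makes the failure reproducible on any OS.
       ctx = multiprocessing.get_context("spawn")
       proc = ctx.Process(target=processor_factory)
       proc.start()  # AttributeError: Can't pickle local object
                     # 'start_agent.<locals>.processor_factory'
       proc.join()
   
   if __name__ == "__main__":
       start_agent()
   ```
   Two generic ways around this, again as a sketch rather than a description of the actual change in this PR: define the factory at module level so pickle can find it by name, or explicitly request the "fork" start method, which inherits the parent's memory and needs no pickling of the target (fork is unavailable on Windows).
   ```python
   import multiprocessing
   
   def processor_factory():
       # Defined at module level, so it is picklable by its qualified
       # name and works under "spawn" as well.
       print("processing DAG files")
   
   if __name__ == "__main__":
       # Alternative: force "fork" instead of the platform default; the
       # child inherits the parent's state, so nothing is pickled.
       ctx = multiprocessing.get_context("fork")
       proc = ctx.Process(target=processor_factory)
       proc.start()
       proc.join()
   ```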
   
   ---
   Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   - [x] Description above provides context of the change
   - [x] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = JIRA ID<sup>*</sup>
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   <sup>*</sup> For document-only changes commit message can start with `[AIRFLOW-XXXX]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Serialization error occurs when the scheduler tries to run on macOS.
> --------------------------------------------------------------------
>
>                 Key: AIRFLOW-6529
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6529
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: 1.10.8
>         Environment: macOS
> Python 3.8
> multiprocessing with spawn mode
>            Reporter: Kousuke Saruta
>            Assignee: Kousuke Saruta
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
