[GitHub] [airflow] dimberman commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.
dimberman commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution. URL: https://github.com/apache/airflow/pull/6627#issuecomment-558849438 @ashb did you ever end up doing a timing comparison with the multiprocessing based solution? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] dimberman commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.
dimberman commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution. URL: https://github.com/apache/airflow/pull/6627#issuecomment-557301508 @bolkedebruin the understanding I have is that when you spawn a totally new process, you are reinitializing the interpreter, re-loading all dependencies, and restarting airflow. using os.fork directly allows you to keep the same memory state (at least that's what I understood from @ashb ) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] dimberman commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.
dimberman commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution. URL: https://github.com/apache/airflow/pull/6627#issuecomment-557301118 @bolkedebruin having the tasks access the DB is a central part of the k8sexecutor. Unless we want to set up some sort of messaging system/message queue This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] dimberman commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.
dimberman commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution. URL: https://github.com/apache/airflow/pull/6627#issuecomment-557283842 @mik-laj it's because multiprocessing.Process has to re-parse all dependencies/DAGs. It causes a lot of slowdown. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] dimberman commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.
dimberman commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution. URL: https://github.com/apache/airflow/pull/6627#issuecomment-557130023 I have tested this locally and it seems to work fine. @ashb when are situations where CAN_FORK is false besides when doing run_as_user? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services