[ https://issues.apache.org/jira/browse/QPID-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brian Bouterse updated QPID-5637:
---------------------------------

    Description:
qpid.messaging has an issue with forking in the following situation.

1. A parent Python process imports and uses qpid.messaging to connect to a Qpid broker
2. The parent process forks a child process
3. The child process imports qpid.messaging and tries to connect to a Qpid broker.

I expected to see the child process use qpid.messaging normally as it would if it weren't forked in the way described above. Instead, the server receives the opening of a TCP socket, but the client never sends the AMQP protocol announcement.

[Forking brings child descriptors with it|http://man7.org/linux/man-pages/man2/fork.2.html]. I expected the file descriptors on the parent and the child to be the same, and to reference the same socket, so I expected qpid.messaging to work without any modification. Surprisingly, it does not.

There is at least one place where I do understand how this can be avoided. One of the issues is that the file descriptors registered by the Selector object inside of qpid.messaging are stale after the fork. The Selector object uses a singleton pattern to provide a reference to the same Selector object no matter how many times you call it. This selector object already has registered file descriptors with the filesystem, which allow the selector to read/write data in an I/O efficient manner. See the attached [pid_aware_selector.patch] for an example of this.

The [pid_aware_selector.patch] does allow communication to flow, but queue creation and deletion sometimes fail in strange ways. For instance, in the child process, code creates a queue and then reads information about that queue. The queue was created, yet the read says that the queue can't be found. Very strange. You can see those things fail using the following simple example:

1. Clone our fork of kombu: `git clone g...@github.com:pulp/kombu.git`
2. Change into the kombu folder: `cd kombu`
3. Switch to the branch containing the qpid code: `git checkout pulp-dep-3.0.15-with-qpid`
4. Install kombu onto your system or virtualenv (I do it systemwide using sudo): `sudo python setup.py develop`
5. Install celery version 3.1.11. I do it using pip. `sudo pip install celery==3.1.11`
6. Install qpid.messaging and qpidtoollibs. One way I do it is systemwide using pip. `sudo pip install qpid-tools qpid-python`
7. Start up qpidd (We've been testing with 0.24 and auth off). `sudo -u qpidd qpidd --auth=no`
8. Put the attached file tasks.py into a directory
9. Open two terminals and change their working directory to be the same as step 8.
10. In one terminal start the celery worker: `celery worker -A tasks --loglevel=INFO -c1`
11. In the other terminal dispatch 10 tasks: `python tasks.py`

You should see exceptions raised similar to those in the attached file [celery_worker_output.txt]

Note that the code on the pulp-dep-3.0.15-with-qpid branch of kombu monkey patches qpid.messaging with the selector patch referenced above, and also one or two other bugfix patches. You can see that [monkey patching done here|https://github.com/pulp/kombu/blob/pulp-dep-3.0.15-with-qpid/kombu/transport/qpid.py#L45]. This should have no implications on this issue, but I want to be explicit about it.

A potential fix: Celery supports a callback after child processes are forked, allowing the call to cleanup/reset exactly these types of things. I could wire up that callback if such a thing existed on qpid.messaging. For testing purposes, you could put code you want to do this cleanup in the task code before the call to

  was:
qpid.messaging has an issue with forking in the following situation.

1. A parent Python process imports and uses qpid.messaging to connect to a Qpid broker
2. The parent process forks a child process
3. The child process imports qpid.messaging and tries to connect to a Qpid broker.

I expected to see the child process use qpid.messaging normally as it would if it weren't forked in the way described above. Instead, the server receives the opening of a TCP socket, but the client never sends the AMQP protocol announcement.

[Forking brings child descriptors with it|http://man7.org/linux/man-pages/man2/fork.2.html]. I expected the file descriptors on the parent and the child to be the same, and to reference the same socket, so I expected qpid.messaging to work without any modification. Surprisingly, it does not.

There is at least one place where I do understand how this can be avoided. One of the issues is that the file descriptors registered by the Selector object inside of qpid.messaging are stale after the fork. The Selector object uses a singleton pattern to provide a reference to the same Selector object no matter how many times you call it. This selector object already has registered file descriptors with the filesystem, which allow the selector to read/write data in an I/O efficient manner. See the attached [pid_aware_selector.patch] for an example of this.

The [pid_aware_selector.patch] does allow communication to flow, but queue creation and deletion seem to fail. You can see those things fail using the following simple example:

1. Clone our fork of kombu: `git clone g...@github.com:pulp/kombu.git`
2. Change into the kombu folder: `cd kombu`
3. Switch to the branch containing the qpid code: `git checkout pulp-dep-3.0.15-with-qpid`
4. Install kombu onto your system or virtualenv (I do it systemwide using sudo): `sudo python setup.py develop`
5. Install celery version 3.1.11. I do it using pip. `sudo pip install celery==3.1.11`
6. Install qpid.messaging and qpidtoollibs. One way I do it is systemwide using pip. `sudo pip install qpid-tools qpid-python`
7. Start up qpidd (We've been testing with 0.24 and auth off). `sudo -u qpidd qpidd --auth=no`
8. Put the attached file tasks.py into a directory
9. Open two terminals and change their working directory to be the same as step 8.
10. In one terminal start the celery worker: `celery worker -A tasks --loglevel=INFO -c1`
11. In the other terminal dispatch 10 tasks: `python tasks.py`

You should see exceptions raised similar to those in the attached file [celery_worker_output.txt]

Note that the code on the pulp-dep-3.0.15-with-qpid branch of kombu monkey patches qpid.messaging with the selector patch referenced above, and also one or two other bugfix patches. You can see that [monkey patching done here|https://github.com/pulp/kombu/blob/pulp-dep-3.0.15-with-qpid/kombu/transport/qpid.py#L45]. This should have no implications on this issue, but I want to be explicit about it.


> qpid.messaging Issues With Forking
> ----------------------------------
>
>                 Key: QPID-5637
>                 URL: https://issues.apache.org/jira/browse/QPID-5637
>             Project: Qpid
>          Issue Type: Bug
>          Components: Python Client
>    Affects Versions: 0.24
>            Reporter: Brian Bouterse
>             Fix For: 0.18, 0.22, 0.24, 0.26, 0.27
>
>         Attachments: celery_worker_output.txt, pid_aware_selector.patch,
> tasks.py
>
>
> qpid.messaging has an issue with forking in the following situation.
> 1. A parent Python process imports and uses qpid.messaging to connect to a
> Qpid broker
> 2. The parent process forks a child process
> 3. The child process imports qpid.messaging and tries to connect to a Qpid
> broker.
> I expected to see the child process use qpid.messaging normally as it would
> if it weren't forked in the way described above. Instead, the server
> receives the opening of a TCP socket, but the client never sends the AMQP
> protocol announcement.
> [Forking brings child descriptors with
> it|http://man7.org/linux/man-pages/man2/fork.2.html]. I expected the file
> descriptors on the parent and the child to be the same, and to reference the
> same socket, so I expected qpid.messaging to work without any modification.
> Surprisingly, it does not.
> There is at least one place where I do understand how this can be avoided.
> One of the issues is that the file descriptors registered by the Selector
> object inside of qpid.messaging are stale after the fork. The Selector
> object uses a singleton pattern to provide a reference to the same Selector
> object no matter how many times you call it. This selector object already
> has registered file descriptors with the filesystem, which allow the selector
> to read/write data in an I/O efficient manner. See the attached
> [pid_aware_selector.patch] for an example of this.
> The [pid_aware_selector.patch] does allow communication to flow, but queue
> creation and deletion sometimes fail in strange ways. For instance, in the
> child process, code creates a queue and then reads information about that
> queue. The queue was created, yet the read says that the queue can't be
> found. Very strange. You can see those things fail using the following
> simple example:
> 1. Clone our fork of kombu: `git clone g...@github.com:pulp/kombu.git`
> 2. Change into the kombu folder: `cd kombu`
> 3. Switch to the branch containing the qpid code: `git checkout
> pulp-dep-3.0.15-with-qpid`
> 4. Install kombu onto your system or virtualenv (I do it systemwide using
> sudo): `sudo python setup.py develop`
> 5. Install celery version 3.1.11. I do it using pip. `sudo pip install
> celery==3.1.11`
> 6. Install qpid.messaging and qpidtoollibs. One way I do it is systemwide
> using pip. `sudo pip install qpid-tools qpid-python`
> 7. Start up qpidd (We've been testing with 0.24 and auth off). `sudo -u
> qpidd qpidd --auth=no`
> 8. Put the attached file tasks.py into a directory
> 9. Open two terminals and change their working directory to be the same as
> step 8.
> 10. In one terminal start the celery worker: `celery worker -A
> tasks --loglevel=INFO -c1`
> 11. In the other terminal dispatch 10 tasks: `python tasks.py`
> You should see exceptions raised similar to those in the attached file
> [celery_worker_output.txt]
> Note that the code on the pulp-dep-3.0.15-with-qpid branch of kombu monkey
> patches qpid.messaging with the selector patch referenced above, and also one
> or two other bugfix patches. You can see that [monkey patching done
> here|https://github.com/pulp/kombu/blob/pulp-dep-3.0.15-with-qpid/kombu/transport/qpid.py#L45].
> This should have no implications on this issue, but I want to be explicit
> about it.
> A potential fix: Celery supports a callback after child processes are
> forked, allowing the call to cleanup/reset exactly these types of things. I
> could wire up that callback if such a thing existed on qpid.messaging. For
> testing purposes, you could put code you want to do this cleanup in the task
> code before the call to

--
This message was sent by Atlassian JIRA
(v6.2#6252)
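
For readers who do not have the attachment at hand, the idea of a "pid aware" singleton described in the report can be sketched roughly as follows. This is only an illustration under stated assumptions, not the contents of [pid_aware_selector.patch] and not the qpid.messaging API: the names PidAwareSelector and child_gets_fresh_selector are hypothetical, and the real Selector also owns a thread and registered sockets that a real fix would have to recreate in the child.

```python
import os
import threading


class PidAwareSelector(object):
    """Hypothetical stand-in for a process-wide selector singleton."""

    _lock = threading.Lock()
    _instance = None

    def __init__(self):
        # Remember which process created this instance.
        self.pid = os.getpid()
        # Placeholder for per-process state (registered descriptors, etc.).
        self.registered = set()

    @classmethod
    def default(cls):
        """Return the singleton, rebuilding it when the pid has changed."""
        with cls._lock:
            stale = cls._instance is None or cls._instance.pid != os.getpid()
            if stale:
                # First call in this process, or we are a forked child
                # holding the parent's copy: start from a clean instance.
                cls._instance = cls()
            return cls._instance


def child_gets_fresh_selector():
    """Fork once and report whether the child saw a new instance (POSIX only)."""
    parent_selector = PidAwareSelector.default()
    pid = os.fork()
    if pid == 0:
        # Child process: the inherited instance carries the parent's pid.
        ok = False
        try:
            ok = PidAwareSelector.default() is not parent_selector
        finally:
            # Leave immediately so the child never runs the parent's code.
            os._exit(0 if ok else 1)
    _, status = os.waitpid(pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```

Within a single process, PidAwareSelector.default() keeps returning the same object; only a call made after a fork yields a new one. A sketch like this resets state held by the singleton itself and nothing else, which may be one reason the report still sees queue operations fail with the selector patch applied.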