[ 
https://issues.apache.org/jira/browse/PROTON-907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gordon Sim updated PROTON-907:
------------------------------
    Attachment: PROTON-907-workaround.patch

The issue appears to be that on the affected platforms, when unable to connect, 
the file descriptor is not marked as writeable.

Although messenger does hit the read error, it closes only the 'tail' of the 
transport as a result. The head is closed when an error is returned from send, 
but because the socket is never writeable, send is never called.
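
As a rough illustration of the socket-level behaviour described above, the sketch below waits on a non-blocking connect and treats an error or hang-up event, not only writeability, as a reason to look at SO_ERROR. It is not Proton code: connect_finished and connect_status are hypothetical names, and the sketch assumes a plain POSIX poll() loop rather than whatever messenger's driver actually does.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>

/* Nonzero if poll() reported anything that can end a pending connect:
 * writeable (POLLOUT) or an error/hang-up condition (POLLERR, POLLHUP). */
int connect_finished(short revents)
{
    return (revents & (POLLOUT | POLLERR | POLLHUP)) != 0;
}

/* Hypothetical helper: wait up to timeout_ms for a non-blocking connect on fd.
 * Returns 0 on success, a positive errno value on failure,
 * or -1 if nothing has been reported within the timeout. */
int connect_status(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLOUT, .revents = 0 };
    int n = poll(&pfd, 1, timeout_ms);

    if (n < 0)
        return errno;                   /* poll itself failed */
    if (n == 0)
        return -1;                      /* timed out, still pending */
    if (pfd.revents & POLLNVAL)
        return EBADF;                   /* fd is not an open descriptor */
    if (!connect_finished(pfd.revents))
        return -1;

    /* Ask the socket for the outcome instead of waiting for a write to fail. */
    int err = 0;
    socklen_t len = sizeof err;
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return errno;
    return err;                         /* 0 means the connect succeeded */
}
```

A caller that acts only on POLLOUT would never reach the SO_ERROR check on a platform that reports a failed connect through POLLERR or POLLHUP alone, which matches the symptom of send never being called.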

I don't know what the real fix for this is; messenger is an area of the code 
I'm even less familiar with. FWIW, the attached patch works around the issue and 
passes all the existing tests. It works by explicitly closing the head of the 
transport if there is an error on reading from the socket and the connection 
has not been closed by the peer.
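
The idea behind the workaround can be sketched with a toy model. This is not the attached patch and does not use the Proton API: toy_transport and toy_on_read are hypothetical names, and the sketch assumes the tail is the input side and the head is the output side of the transport.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Toy stand-in for a transport; not the Proton pn_transport_t type. */
struct toy_transport {
    bool tail_closed;   /* input side: no more bytes will be read  */
    bool head_closed;   /* output side: no more bytes will be sent */
};

/* Apply the result of a read (as returned by recv()) to the transport.
 * n > 0: data arrived; n == 0: peer closed; n < 0: read error. */
void toy_on_read(struct toy_transport *t, ssize_t n)
{
    if (n > 0)
        return;                 /* normal data, nothing to close */

    t->tail_closed = true;      /* no further input in either case */

    if (n < 0)
        t->head_closed = true;  /* workaround: do not wait for send to fail */
}
```

With this rule, a read error closes both halves even if the socket never becomes writeable, while an orderly close by the peer still closes only the tail.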

> Qpid Proton Point to Point Hang on CentOS 6 pn_messenger_send
> -------------------------------------------------------------
>
>                 Key: PROTON-907
>                 URL: https://issues.apache.org/jira/browse/PROTON-907
>             Project: Qpid Proton
>          Issue Type: Bug
>          Components: proton-c
>    Affects Versions: 0.8, 0.9.1
>         Environment: CentOS 6 (both VM and native 64-bit) and RHEL 6
>            Reporter: Frank Quinn
>            Priority: Critical
>         Attachments: PROTON-907-workaround.patch
>
>
> See thread at 
> http://qpid.2158936.n2.nabble.com/Strange-behaviour-for-pn-messenger-send-on-CentOS-6-td7625846.html.
> Key points:
> * pn_messenger_send will hang on CentOS 6 if the destination is not yet up
> * Works fine on Fedora 21 and 22 (by 'fine', I mean it will attempt to send, 
> fail and move on)
> * Can be recreated by running the send.c application when recv.c is not yet 
> running
> * Proton burns CPU as it hangs
>
> This effectively deadlocks our application. So far, I've tried compiling Qpid 
> Proton C myself (both 0.8 and 0.9.1), setting the pn_messenger_send timeout to 
> 1 (it was previously -1), turning off iptables entirely, and disabling SELinux 
> and rebooting, but no luck.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
