[
https://issues.apache.org/jira/browse/PROTON-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Pavel Moravec reopened PROTON-1000:
-----------------------------------
Reopening both PROTON-1000 and PROTON-1003: the backport to 0.9, at least,
does not fix it. Reproducer:
{code}
#!/usr/bin/python
from time import sleep
from uuid import uuid4
from proton import ConnectionException, Timeout
from proton import SSLDomain, SSLException
#from proton import Message
from proton.utils import BlockingConnection
import random
import threading

ROUTER_ADDRESS = "amqps://dispatch-router:5671"
ADDRESS = "some_destination"
HEARTBEAT = 2
TIMEOUT = 3

class ReceiverThread(threading.Thread):
    def __init__(self, domain=None):
        super(ReceiverThread, self).__init__()
        self.domain = domain
        self.running = True

    def connect(self):
        self.conn = BlockingConnection(ROUTER_ADDRESS, ssl_domain=self.domain,
                                       heartbeat=HEARTBEAT)
        self.recv = self.conn.create_receiver(ADDRESS, name=str(uuid4()),
                                              dynamic=False, options=None)

    def run(self):
        while self.running:
            self.connect()
            while self.running:
                try:
                    msg = self.recv.receive(TIMEOUT)
                    if (msg):
                        print "message received: %s" % msg
                        self.recv.accept()
                except:
                    print "receiver failed to accept msg, reconnecting.."
                    try:
                        self.conn.close()  # underlying TCP connection never gone
                    except:
                        print "receiver thread: failed to close connection"
                        pass
                    self.connect()

    def stop(self):
        self.running = False

ca_certificate = '/etc/rhsm/ca/katello-default-ca.pem'
client_certificate = '/etc/pki/consumer/bundle.pem'
client_key = None

domain = SSLDomain(SSLDomain.MODE_CLIENT)
domain.set_trusted_ca_db(ca_certificate)
domain.set_credentials(
    client_certificate,
    client_key or client_certificate, None)
domain.set_peer_authentication(SSLDomain.VERIFY_PEER)

rcv_thread = ReceiverThread(domain)
rcv_thread.start()
_in = raw_input("Press Enter to exit:")
rcv_thread.stop()
rcv_thread.join()
{code}
With SSL enabled (as above), ESTABLISHED connections leak, one per
`receiver failed to accept msg, reconnecting` log line; `self.conn.close()`
apparently has no effect.
With SSL disabled (just set `ssl_domain=None`), CLOSE_WAIT connections leak
instead, again one per `receiver failed to accept msg, reconnecting` log
line.
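One way to check for the leak while the reproducer runs is to count the sockets to the router port by TCP state. A minimal sketch, assuming Linux and IPv4: it parses the `/proc/net/tcp` format (IPv6 sockets are listed in `/proc/net/tcp6` with the same column layout). The names `count_states` and `count_states_live` are mine, not part of Proton:

```python
from collections import Counter

# TCP state codes as they appear (in hex) in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

def count_states(proc_net_tcp_text, remote_port):
    """Count sockets per TCP state whose remote port is remote_port."""
    counts = Counter()
    # The first line is the column header, so skip it.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue
        # fields[2] is rem_address in the form HEXADDR:HEXPORT.
        port = int(fields[2].rsplit(":", 1)[1], 16)
        if port == remote_port:
            counts[TCP_STATES.get(fields[3], fields[3])] += 1
    return counts

def count_states_live(remote_port, path="/proc/net/tcp"):
    """Same as count_states, but read the live table from path."""
    with open(path) as f:
        return count_states(f.read(), remote_port)
```

If the leak is present, calling `count_states_live(5671)` after each "reconnecting" log line should show the ESTABLISHED count (SSL) or the CLOSE_WAIT count (non-SSL) growing by one each time.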
> Connection leak on heartbeat-timeouted connections
> --------------------------------------------------
>
> Key: PROTON-1000
> URL: https://issues.apache.org/jira/browse/PROTON-1000
> Project: Qpid Proton
> Issue Type: Bug
> Components: python-binding
> Affects Versions: 0.9
> Reporter: Pavel Moravec
> Assignee: Gordon Sim
> Fix For: 0.11
>
>
> Using gofer/katello-agent, which uses BlockingConnection from the Proton
> reactor with heartbeats set up: if a connection times out due to the
> heartbeats, Proton does not close the TCP connection. That causes a TCP
> connection leak, even though gofer properly called
> BlockingConnection.close() and dropped every reference to that class
> instance.
> Checking tcpdump, Proton simply ignores the timed-out connections: it does
> not respond in any way to whatever the communication peer sends. In some
> scenarios the peer sends an AMQP performative that Proton was expected to
> answer; in others the peer drops the TCP connection by sending a FIN+ACK
> packet, but Proton does not send a FIN packet back. The only thing seen in
> tcpdump is ACKing on the TCP layer, done by the OS and not by Proton.
> Proton also ignores the Proton reactor's attempt to close the
> connection/container, raising:
> Sep 21 15:02:35 my-capsule goferd: File "/usr/lib64/python2.7/site-packages/proton/utils.py", line 263, in on_transport_closed
> Sep 21 15:02:35 my-capsule goferd: raise ConnectionException("Connection %s disconnected" % self.url);
> Sep 21 15:02:35 my-capsule goferd: ConnectionException: Connection amqps://satellite.example.com:5647 disconnected
> for SSL connections, and raising:
> Sep 21 14:56:28 my-capsule goferd: File "/usr/lib64/python2.7/site-packages/proton/utils.py", line 259, in on_transport_tail_closed
> Sep 21 14:56:28 my-capsule goferd: self.on_transport_closed(event)
> Sep 21 14:56:28 my-capsule goferd: File "/usr/lib64/python2.7/site-packages/proton/utils.py", line 263, in on_transport_closed
> Sep 21 14:56:28 my-capsule goferd: raise ConnectionException("Connection %s disconnected" % self.url);
> Sep 21 14:56:28 my-capsule goferd: ConnectionException: Connection amqps://satellite.example.com:5647 disconnected
> (Some of the difference between SSL and non-SSL could come from the fact
> that, in my case, the server side (qdrouterd / Qpid Dispatch Router) sends
> a FIN+ACK packet for the non-SSL connection, while for the SSL connection
> it sends no such packet and keeps sending empty AMQP frames forever because
> heartbeats are enabled.)
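For reference, the CLOSE_WAIT half of this can be illustrated without Proton at all. A minimal sketch using only the standard `socket` module on localhost (the function name `peer_close_demo` is hypothetical): once the peer closes its end, the local `recv()` returns an empty string, and the local socket then sits in CLOSE_WAIT until the application itself calls `close()`. That local `close()` is the step that appears to be missing in the non-SSL case.

```python
import socket

def peer_close_demo():
    """Show what the local side sees when the peer closes the connection."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    server.listen(1)

    client = socket.create_connection(server.getsockname())
    accepted, _ = server.accept()

    # The peer closes its end; the OS sends FIN to the client.
    accepted.close()

    client.settimeout(5)
    # An empty result from recv() means the peer performed an orderly shutdown.
    data = client.recv(1024)

    # Between the recv() above and the close() below, the client socket is in
    # CLOSE_WAIT. Only this close() makes the OS send the client's own FIN;
    # if the application never calls it, the socket leaks in CLOSE_WAIT.
    client.close()
    server.close()
    return data
```

The ACKs seen in tcpdump fit this picture: the OS acknowledges the peer's FIN on its own, but the FIN in the other direction is only sent once the application closes the socket.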