[
https://issues.apache.org/jira/browse/DISPATCH-1110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16640371#comment-16640371
]
Chuck Rolke commented on DISPATCH-1110:
---------------------------------------
Thanks Robbie. That looks very similar.
I have observed that the QIT send test sends eight messages and then closes the
connection. I used PN_TRACE_EVT on both the QIT Sender and the router and see
the same pattern: the Sender client closes the connection soon after
sending, when only 2, 3, or 4 accept dispositions have been received. A
typical Sender trace (with some extra application printf commentary) is
appended.
In the QIT test I don't see how to make the sender wait until all the
messages have been accepted before closing the connection.
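For illustration, the bookkeeping a sender would need in order to wait can be modelled in a few lines. This is only a sketch of the counting logic, not the QIT shim code: the class `SenderModel`, the helper `run`, and the credit value are hypothetical, and the callback names are borrowed from the printf commentary in the trace. In a Proton C++ handler the equivalent step would presumably be to call `t.connection().close()` from `on_tracker_accept(proton::tracker&)` once the confirmed count reaches the total; that mapping is an assumption about how the shim could be changed.

```python
class SenderModel:
    """Toy model of a sender's bookkeeping (hypothetical, not the QIT shim)."""

    def __init__(self, total):
        self.total = total      # messages the test wants to send
        self.sent = 0           # messages handed to the link so far
        self.confirmed = 0      # accept dispositions received so far
        self.closed = False     # whether the sender has closed the connection

    def on_sendable(self, credit):
        # Send while the link has credit and messages remain.
        while credit > 0 and self.sent < self.total:
            self.sent += 1
            credit -= 1

    def on_tracker_accept(self):
        # Count the disposition; close only once every message is confirmed.
        self.confirmed += 1
        if self.confirmed == self.total:
            self.closed = True


def run(total, credit):
    """Drive the model: send everything, then deliver one accept per message."""
    model = SenderModel(total)
    model.on_sendable(credit)
    closed_after_each_accept = []
    for _ in range(model.sent):
        model.on_tracker_accept()
        closed_after_each_accept.append(model.closed)
    return model, closed_after_each_accept
```

With `run(8, 250)` the model stays open through the first seven accepts and closes only on the eighth, which is the behaviour the trace below does not show.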
> PN_TRACE_EVT=1
> /opt/local/libexec/qpid_interop_test/shims/qpid-proton-cpp/amqp_large_content_test/Sender
> [::1]:5672 qit.amqp_large_content_test.map.ProtonCpp.ProtonCpp map
> '[[1, [1, 16, 256, 4096]], [10, [1, 16, 256, 4096]]]'
[0x22f3560]:(PN_CONNECTION_INIT, pn_connection<0x22ee070>)
[0x22f3560]:(PN_SESSION_INIT, pn_session<0x22ef600>)
[0x22f3560]:(PN_LINK_INIT, pn_link<0x22f0a70>)
[0x22f3560]:(PN_CONNECTION_BOUND, pn_connection<0x22ee070>)
[0x22f3560]:(PN_CONNECTION_REMOTE_OPEN, pn_connection<0x22ee070>)
[0x22f3560]:(PN_TRANSPORT, pn_transport<0x22f3560>)
[0x22f3560]:(PN_SESSION_REMOTE_OPEN, pn_session<0x22ef600>)
[0x22f3560]:(PN_LINK_REMOTE_OPEN, pn_link<0x22f0a70>)
[0x22f3560]:(PN_LINK_FLOW, pn_link<0x22f0a70>)
on_sendable: sent 8 messages
[0x22f3560]:(PN_TRANSPORT, pn_transport<0x22f3560>)
[0x22f3560]:(PN_LINK_FLOW, pn_link<0x22f0a70>)
on_sendable: doing nothing. Already sent 8 messages
[0x22f3560]:(PN_LINK_FLOW, pn_link<0x22f0a70>)
on_sendable: doing nothing. Already sent 8 messages
[0x22f3560]:(PN_DELIVERY, pn_delivery<0x2304440>{sending, tag=b"\x01\x00\x00\x00\x00\x00\x00\x00", local=unknown, remote=accepted})
on_tracker_accept: msgsConfirmed 1
[0x22f3560]:(PN_TRANSPORT, pn_transport<0x22f3560>)
[0x22f3560]:(PN_LINK_FLOW, pn_link<0x22f0a70>)
on_sendable: doing nothing. Already sent 8 messages
[0x22f3560]:(PN_DELIVERY, pn_delivery<0x2325490>{sending, tag=b"\x02\x00\x00\x00\x00\x00\x00\x00", local=unknown, remote=accepted})
on_tracker_accept: msgsConfirmed 2
[0x22f3560]:(PN_CONNECTION_LOCAL_CLOSE, pn_connection<0x22ee070>) <-- QIT Sender closes connection
[0x22f3560]:(PN_TRANSPORT, pn_transport<0x22f3560>)
[0x22f3560]:(PN_LINK_FLOW, pn_link<0x22f0a70>)
on_sendable: doing nothing. Already sent 8 messages
[0x22f3560]:(PN_DELIVERY, pn_delivery<0x2301e70>{sending, tag=b"\x03\x00\x00\x00\x00\x00\x00\x00", local=unknown, remote=accepted})
on_tracker_accept: msgsConfirmed 3
[0x22f3560]:(PN_TRANSPORT, pn_transport<0x22f3560>)
[0x22f3560]:(PN_TRANSPORT_HEAD_CLOSED, pn_transport<0x22f3560>)
[0x22f3560]:(PN_LINK_FLOW, pn_link<0x22f0a70>)
on_sendable: doing nothing. Already sent 8 messages
[0x22f3560]:(PN_DELIVERY, pn_delivery<0x252aa60>{sending, tag=b"\x04\x00\x00\x00\x00\x00\x00\x00", local=unknown, remote=accepted})
on_tracker_accept: msgsConfirmed 4
[0x22f3560]:(PN_TRANSPORT, pn_transport<0x22f3560>)
[0x22f3560]:(PN_CONNECTION_REMOTE_CLOSE, pn_connection<0x22ee070>)
[0x22f3560]:(PN_TRANSPORT_TAIL_CLOSED, pn_transport<0x22f3560>)
[0x22f3560]:(PN_TRANSPORT_CLOSED, pn_transport<0x22f3560>)
on_transport_close sent: 8 , confirmed: 4
on_container_stop: msgsConfirmed 4
amqp_large_content_test Sender container.run() exited
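A trace like this can be checked mechanically. The following is a minimal sketch, assuming the PN_TRACE_EVT line format shown above; the function name `accepts_before_local_close` is hypothetical. It counts `remote=accepted` deliveries and records how many had arrived when PN_CONNECTION_LOCAL_CLOSE was logged, and it tolerates PN_DELIVERY entries that the mailer has wrapped onto two lines.

```python
import re

# Matches the event name in lines such as "[0x22f3560]:(PN_LINK_FLOW, pn_link<...>)".
EVENT_RE = re.compile(r"\((PN_[A-Z_]+),")


def accepts_before_local_close(trace_lines):
    """Return (accepts seen at local close, total accepts) for a PN_TRACE_EVT log.

    The first element is None if the trace has no PN_CONNECTION_LOCAL_CLOSE.
    """
    accepted = 0
    at_close = None
    in_delivery = False  # True while inside a (possibly wrapped) PN_DELIVERY entry
    for line in trace_lines:
        match = EVENT_RE.search(line)
        if match:
            event = match.group(1)
            in_delivery = event == "PN_DELIVERY"
            if event == "PN_CONNECTION_LOCAL_CLOSE" and at_close is None:
                at_close = accepted
        # The disposition state may sit on the event line or on its continuation.
        if in_delivery and "remote=accepted" in line:
            accepted += 1
            in_delivery = False
    return at_close, accepted
```

Fed the Sender trace above, this should report 2 accepts at the time of the local close and 4 in total, against 8 messages sent.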
> Intermittent router hang while running QIT's AMQP large content test
> --------------------------------------------------------------------
>
> Key: DISPATCH-1110
> URL: https://issues.apache.org/jira/browse/DISPATCH-1110
> Project: Qpid Dispatch
> Issue Type: Bug
> Environment: Standard QIT environment.
> Once QIT is built and installed, the environment is set using the config.sh
> file. See QUICKSTART for details.
> Reporter: Kim van der Riet
> Assignee: Ganesh Murthy
> Priority: Major
> Attachments: qdrouterd.conf
>
>
> When running the Qpid Interop Test's AMQP large content test, a stand-alone
> router will intermittently hang and cause the test to time out.
> The failure appears to be limited to either the AMQP list or map types, and
> usually with the C++ client as the message sender. The C++, Python2 and
> Python3 receiver clients have all shown this failure, but the Python2
> receiver client seems to reproduce it more readily on my hardware.
> In all cases, the test fails when the router sends what I suppose is the
> final transfer of a large message (I have not added up/counted the bytes of
> the many preceding transfers) to the consumer. The consumer then sends a
> disposition, but the router does not respond again until the test times out.
> The consumer can be seen to send heartbeats to the router, but the router
> does not send any of its own.
> {noformat}
> ... (plenty of 65550-sized frames R->C)
> R->C 5976 3.454766 ::1 ::1 AMQP 65550
> R->C 5977 3.454775 ::1 ::1 AMQP 65550
> R->C 5978 3.454783 ::1 ::1 AMQP 48171
> C->R 5982 3.529881 ::1 ::1 AMQP 115 disposition
> C->R 5984 7.530704 ::1 ::1 AMQP 94 (empty)
> C->R 5986 11.532306 ::1 ::1 AMQP 94 (empty)
> ...{noformat}
> There are no errors to be seen in the router logs other than when the
> consuming client is killed owing to the test timeout.
> {noformat}
> ...
> 2018-08-29 12:50:23.191754 -0400 SERVER (info) [14]: Accepted connection to
> ::1:amqp from ::1:37262
> 2018-08-29 12:51:19.562695 -0400 SERVER (info) [14]: Connection from
> ::1:37262 (to ::1:amqp) failed: amqp:connection:framing-error connection
> aborted
> {noformat}
> The reproducer is not fully reliable: the error occurs about 50% of the time
> on my hardware.