[
https://issues.apache.org/jira/browse/DISPATCH-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16471157#comment-16471157
]
Gordon Sim commented on DISPATCH-994:
-------------------------------------
Digging a little deeper, the issue here is the reuse of a link name on a
session before the previous use of that name has fully closed. The test case
attached here is arguably incorrect, as it does not wait for the connection
close to be confirmed before resubscribing with the same link names. However,
even a modified version that does wait can trigger the same problem.
DISPATCH-997 is a different symptom of the same root cause, and the test there
does wait for the connection close before reusing the names. If router-c is
run under valgrind, that too can trigger this segfault.
The only way to avoid it would be for the application to wait for the link
detach to be confirmed before closing the connection, but that is not
something that can be relied on: if the connection ends (cleanly or due to a
disconnect) before the link is closed, the router will confirm the close of
the connection before waiting for the detach it relays down the link route to
be echoed back.
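To make the required discipline concrete, here is a toy sketch (plain Python, not the proton API or any router code; the names `Link`, `close_and_confirm`, etc. are illustrative) of what "wait for the detach to be confirmed before closing the connection" would mean for the application:

```python
import threading

class Link:
    """Toy model of an AMQP link: the peer must echo the detach back
    before the link is fully closed and its name is safe to reuse."""
    def __init__(self, name):
        self.name = name
        self.detach_echoed = threading.Event()

    def close(self):
        # In a real client this would send the detach frame; the peer
        # (here, the router relaying it down the link route) echoes it
        # back asynchronously at some later time.
        pass

    def on_peer_detach(self):
        # Called when the echoed detach arrives from the peer.
        self.detach_echoed.set()

def close_and_confirm(link, close_connection, timeout=5.0):
    """Close the link, wait for the peer's detach echo, and only then
    close the connection -- the ordering the application would need."""
    link.close()
    if not link.detach_echoed.wait(timeout):
        raise TimeoutError("detach for %r was never echoed back" % link.name)
    close_connection()
```

The point of the sketch is that the wait sits between the link close and the connection close; as noted above, nothing guarantees the application gets the chance to perform it if the connection drops first.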
If an attach with the same name arrives before the detach for the previous use
of that name has been echoed back, the previous link is not fully closed (it
is locally open, remotely closed), and when proton handles the attach it hands
back that previous object, which is in the wrong state. This either leads to
the router incorrectly treating the attach as the echoing back of a
router-initiated link, which causes the segfault described in this issue
because the correct context has not been set up, or it causes the attach to be
ignored and not echoed back. The former happens when the detach is echoed back
slowly, so running router-c under valgrind makes it more likely.
Fundamentally I think this is an issue in using the same session for all routed
links, where the links are detached asynchronously.
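The name-reuse race described above can be sketched with a toy model (plain Python, an assumption for illustration only -- not proton internals; `Session`, `attach`, `remote_detach` are made-up names): links on a session are indexed by name, the name is only released once the link is fully closed at both ends, so re-attaching too early hands back the stale half-closed object.

```python
LOCAL_ACTIVE, LOCAL_CLOSED = "local-active", "local-closed"
REMOTE_ACTIVE, REMOTE_CLOSED = "remote-active", "remote-closed"

class Link:
    def __init__(self, name):
        self.name = name
        self.local_state = LOCAL_ACTIVE
        self.remote_state = REMOTE_ACTIVE

class Session:
    """Links are indexed by name; an entry is only removed once the
    link is fully closed at both ends (detach sent AND echoed back)."""
    def __init__(self):
        self.links = {}

    def attach(self, name):
        link = self.links.get(name)
        if link is None:
            link = Link(name)
            self.links[name] = link
        # If the name is still present, this returns the previous,
        # possibly half-closed, link object -- the bad case.
        return link

    def remote_detach(self, name):
        # The peer closed its end; the local end stays open until the
        # detach relayed down the link route is echoed back.
        self.links[name].remote_state = REMOTE_CLOSED
```

Re-attaching with the same name after a `remote_detach` but before the echo returns the old link in exactly the "locally open, remotely closed" state described above, which the attach handler then misinterprets.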
> segfault in qdr_link_second_attach
> ----------------------------------
>
> Key: DISPATCH-994
> URL: https://issues.apache.org/jira/browse/DISPATCH-994
> Project: Qpid Dispatch
> Issue Type: Bug
> Affects Versions: 1.1.0
> Reporter: Gordon Sim
> Priority: Major
> Attachments: router-a.conf, router-b.conf, router-c.conf,
> topic_test.py
>
>
> Link routing from router A through router B to a 'broker', and closing and
> opening two receivers causes a segfault.
> {noformat}
> ==25674== Thread 4:
> ==25674== Invalid read of size 8
> ==25674== at 0x4E77EEF: qdr_link_second_attach (connections.c:474)
> ==25674== by 0x4E87142: AMQP_link_attach_handler (router_node.c:680)
> ==25674== by 0x4E8BF2B: handle (server.c:940)
> ==25674== by 0x4E8CBA7: thread_run (server.c:958)
> ==25674== by 0x54FA739: start_thread (in /usr/lib64/libpthread-2.24.so)
> ==25674== by 0x6288E7E: clone (in /usr/lib64/libc-2.24.so)
> ==25674== Address 0x10 is not stack'd, malloc'd or (recently) free'd
> ==25674==
> ==25674==
> ==25674== Process terminating with default action of signal 11 (SIGSEGV):
> dumping core
> ==25674== Access not within mapped region at address 0x10
> ==25674== at 0x4E77EEF: qdr_link_second_attach (connections.c:474)
> ==25674== by 0x4E87142: AMQP_link_attach_handler (router_node.c:680)
> ==25674== by 0x4E8BF2B: handle (server.c:940)
> ==25674== by 0x4E8CBA7: thread_run (server.c:958)
> ==25674== by 0x54FA739: start_thread (in /usr/lib64/libpthread-2.24.so)
> ==25674== by 0x6288E7E: clone (in /usr/lib64/libc-2.24.so)
> ==25674== If you believe this happened as a result of a stack
> ==25674== overflow in your program's main thread (unlikely but
> ==25674== possible), you can try to increase the size of the
> ==25674== main thread stack using the --main-stacksize= flag.
> ==25674== The main thread stack size used in this run was 8388608
> {noformat}
> To reproduce, start three routers with the attached config files, then run
> the attached python test program.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)