[
https://issues.apache.org/jira/browse/DISPATCH-902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16301679#comment-16301679
]
Alan Conway commented on DISPATCH-902:
--------------------------------------
Fixed PROTON-1727 which I believe is the root cause of this problem.
[~ganeshmurthy] can you run the dispatch tests to verify before resolving this
issue?
> Intermittent crash with link to broker when broker closed
> ---------------------------------------------------------
>
> Key: DISPATCH-902
> URL: https://issues.apache.org/jira/browse/DISPATCH-902
> Project: Qpid Dispatch
> Issue Type: Bug
> Affects Versions: 1.0.0
> Reporter: Kim van der Riet
> Assignee: Ganesh Murthy
> Priority: Blocker
> Fix For: 1.1.0
>
> Attachments: qdrouterd.node1.conf, qdrouterd.node2.conf,
> qpidd.d2n.conf, testme.tgz
>
>
> When using dispatch in a 2-node configuration with a broker between them:
> {noformat}
>        9002            10001           10001             9003
> sender ----> dispatch1 -----> qpid-cpp -----> dispatch2 -----> receiver
> {noformat}
> and initializing in the following order:
> # start dispatch1
> # start dispatch2
> # start qpid-cpp
> # wait for "Link Route Activated" messages on both dispatch nodes
> # stop qpid-cpp
> then the dispatch nodes will dump core after a random amount of time and
> after logging a random number of
> {noformat}
> (info) Connection to localhost:10001 failed: proton:io Connection refused -
> on read from localhost:10001
> {noformat}
> messages.
> The stack trace is as follows for all occurrences:
> {noformat}
> Thread 3 "qdrouterd" received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7fffea269700 (LWP 10954)]
> pn_transport_tail_closed (transport=0x0) at
> /home/kpvdr/RedHat/qpid-proton/proton-c/src/core/transport.c:3044
> 3044 bool pn_transport_tail_closed(pn_transport_t *transport) { return
> transport->tail_closed; }
> (gdb) thread apply all bt
> Thread 5 (Thread 0x7fffe9267700 (LWP 10956)):
> #0 0x00007ffff67eb6d3 in epoll_wait () at
> ../sysdeps/unix/syscall-template.S:84
> #1 0x00007ffff77327e2 in proactor_do_epoll (p=0x89b550,
> can_block=can_block@entry=true) at
> /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:1978
> #2 0x00007ffff77337ca in pn_proactor_wait (p=<optimized out>) at
> /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:2025
> #3 0x00007ffff7bbc219 in thread_run (arg=0x89ec20) at
> /home/kpvdr/RedHat/qpid-dispatch/src/server.c:932
> #4 0x00007ffff75185ca in start_thread (arg=0x7fffe9267700) at
> pthread_create.c:333
> #5 0x00007ffff67eb0cd in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
> Thread 4 (Thread 0x7fffe9a68700 (LWP 10955)):
> #0 0x00007ffff67eb6d3 in epoll_wait () at
> ../sysdeps/unix/syscall-template.S:84
> #1 0x00007ffff77327e2 in proactor_do_epoll (p=0x89b550,
> can_block=can_block@entry=true) at
> /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:1978
> #2 0x00007ffff77337ca in pn_proactor_wait (p=<optimized out>) at
> /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:2025
> #3 0x00007ffff7bbc219 in thread_run (arg=0x89ec20) at
> /home/kpvdr/RedHat/qpid-dispatch/src/server.c:932
> #4 0x00007ffff75185ca in start_thread (arg=0x7fffe9a68700) at
> pthread_create.c:333
> #5 0x00007ffff67eb0cd in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
> Thread 3 (Thread 0x7fffea269700 (LWP 10954)):
> #0 pn_transport_tail_closed (transport=0x0) at
> /home/kpvdr/RedHat/qpid-proton/proton-c/src/core/transport.c:3044
> #1 0x00007ffff794f4f9 in pn_connection_driver_read_closed
> (d=d@entry=0x7fffdc054288) at
> /home/kpvdr/RedHat/qpid-proton/proton-c/src/core/connection_driver.c:109
> #2 0x00007ffff7731ef1 in pconnection_rclosed (pc=0x7fffdc053ce0) at
> /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:898
> #3 pconnection_process (pc=0x7fffdc053ce0, events=<optimized out>,
> timeout=timeout@entry=false, topup=topup@entry=false) at
> /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:1084
> #4 0x00007ffff7732945 in proactor_do_epoll (p=0x89b550,
> can_block=can_block@entry=true) at
> /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:2007
> #5 0x00007ffff77337ca in pn_proactor_wait (p=<optimized out>) at
> /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:2025
> #6 0x00007ffff7bbc219 in thread_run (arg=0x89ec20) at
> /home/kpvdr/RedHat/qpid-dispatch/src/server.c:932
> #7 0x00007ffff75185ca in start_thread (arg=0x7fffea269700) at
> pthread_create.c:333
> #8 0x00007ffff67eb0cd in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
> Thread 2 (Thread 0x7fffeaa6a700 (LWP 10953)):
> #0 pthread_cond_wait@@GLIBC_2.3.2 () at
> ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
> #1 0x00007ffff7ba2949 in sys_cond_wait (cond=<optimized out>,
> held_mutex=<optimized out>) at
> /home/kpvdr/RedHat/qpid-dispatch/src/posix/threading.c:91
> #2 0x00007ffff7bb0cf5 in router_core_thread (arg=0x8f8c90) at
> /home/kpvdr/RedHat/qpid-dispatch/src/router_core/router_core_thread.c:66
> #3 0x00007ffff75185ca in start_thread (arg=0x7fffeaa6a700) at
> pthread_create.c:333
> #4 0x00007ffff67eb0cd in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
> Thread 1 (Thread 0x7ffff7fbb180 (LWP 10946)):
> #0 0x00007ffff67eb6d3 in epoll_wait () at
> ../sysdeps/unix/syscall-template.S:84
> #1 0x00007ffff77327e2 in proactor_do_epoll (p=0x89b550,
> can_block=can_block@entry=true) at
> /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:1978
> #2 0x00007ffff77337ca in pn_proactor_wait (p=<optimized out>) at
> /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:2025
> #3 0x00007ffff7bbc219 in thread_run (arg=arg@entry=0x89ec20) at
> /home/kpvdr/RedHat/qpid-dispatch/src/server.c:932
> #4 0x00007ffff7bbc2f0 in qd_server_run (qd=<optimized out>) at
> /home/kpvdr/RedHat/qpid-dispatch/src/server.c:1186
> #5 0x00000000004017dc in main_process (config_path=0x7fffffffda56
> "/home/kpvdr/RedHat/install/etc/qpid-dispatch/qdrouterd.node2.conf",
> python_pkgdir=<optimized out>, fd=2)
> at /home/kpvdr/RedHat/qpid-dispatch/router/src/main.c:111
> #6 0x00000000004015ec in main (argc=3, argv=0x7fffffffd638) at
> /home/kpvdr/RedHat/qpid-dispatch/router/src/main.c:318
> {noformat}
> More detail:
>
> {noformat}
> (gdb) bt full
> #0 pn_transport_tail_closed (transport=0x0) at
> /home/kpvdr/RedHat/qpid-proton/proton-c/src/core/transport.c:3044
> No locals.
> #1 0x00007ffff794f4f9 in pn_connection_driver_read_closed
> (d=d@entry=0x7fffe0071108) at
> /home/kpvdr/RedHat/qpid-proton/proton-c/src/core/connection_driver.c:109
> No locals.
> #2 0x00007ffff7731ef1 in pconnection_rclosed (pc=0x7fffe0070b60) at
> /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:898
> No locals.
> #3 pconnection_process (pc=0x7fffe0070b60, events=<optimized out>,
> timeout=timeout@entry=false, topup=topup@entry=false) at
> /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:1084
> inbound_wake = <optimized out>
> rearm_timer = <optimized out>
> timer_fired = <optimized out>
> waking = false
> tick_required = false
> rearm_pc = <optimized out>
> #4 0x00007ffff7732945 in proactor_do_epoll (p=0x89b550,
> can_block=can_block@entry=true) at
> /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:2007
> batch = 0x0
> ev = {events = 21, data = {ptr = 0x7fffe0070b70, fd = -536409232, u32
> = 3758558064, u64 = 140736951946096}}
> n = <optimized out>
> ee = 0x7fffe0070b70
> timeout = -1
> #5 0x00007ffff77337ca in pn_proactor_wait (p=<optimized out>) at
> /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:2025
> No locals.
> #6 0x00007ffff7bbc219 in thread_run (arg=0x89ec20) at
> /home/kpvdr/RedHat/qpid-dispatch/src/server.c:932
> events = <optimized out>
> e = <optimized out>
> qd_server = 0x89ec20
> running = true
> #7 0x00007ffff75185ca in start_thread (arg=0x7fffe9a68700) at
> pthread_create.c:333
> __res = <optimized out>
> pd = 0x7fffe9a68700
> now = <optimized out>
> unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140737113392896,
> -7680037156526760930, 140737488344079, 4096, 140737113392896,
> 140737113393600, 7679998188122266654, 7680018654140947486},
> mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data
> = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
> not_first_call = <optimized out>
> pagesize_m1 = <optimized out>
> sp = <optimized out>
> freesize = <optimized out>
> #8 0x00007ffff67eb0cd in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
> No locals.
> {noformat}
> Dispatch, qpid-cpp and Proton were all built from master yesterday (Dec 14).
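
For readers following the trace: frame #0 is entered with transport=0x0, so
the accessor dereferences a NULL pointer. The sketch below reproduces only the
shape of that failure and one possible defensive guard. All type and function
names in it are hypothetical stand-ins, not Proton's real structs or API, and
it is not the PROTON-1727 fix, which may instead address why the pointer was
NULL in the first place.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; NOT Proton's real types. */
typedef struct {
    bool tail_closed;
} sketch_transport_t;

typedef struct {
    sketch_transport_t *transport; /* assumed to be NULL once torn down */
} sketch_driver_t;

/* Same shape as the accessor in frame #0: no NULL check before the read. */
bool sketch_tail_closed_unchecked(const sketch_transport_t *t) {
    return t->tail_closed; /* crashes if t == NULL */
}

/* Defensive variant. Treating a missing transport as "read side closed"
 * is an assumption made for this sketch, not a statement about Proton. */
bool sketch_read_closed(const sketch_driver_t *d) {
    if (d == NULL || d->transport == NULL) {
        return true;
    }
    return sketch_tail_closed_unchecked(d->transport);
}
```

A guard like this only hides the symptom; if the NULL pointer comes from an
object being released while another thread still processes events for it, the
lifetime problem itself still has to be fixed.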