[jira] [Commented] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584025#comment-16584025
 ] 

ASF GitHub Bot commented on PROTON-1842:


Github user cliffjansen commented on the issue:

https://github.com/apache/qpid-proton/pull/145
  
Substantially adopted but with similar valgrind cleanups as in 37cd347.


> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.22.0
>Reporter: Chuck Rolke
>Assignee: Alan Conway
>Priority: Major
> Fix For: proton-c-0.25.0
>
> Attachments: helloworld.cpp, race.tsan, race.vg
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
> {{  void on_connection_open(proton::connection& c) {}}
> {{  c.close();}}
> {{  }}}
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2030}}
> {{#6  0x7f795dbe89ad in thread_run (arg=0x26be750) at 
> /home/chug/git/qpid-dispatch/src/server.c:946}}
> {{#7  0x7f795d50e50b in start_thread (arg=0x7f794f486700) at 
> pthread_create.c:465}}
> {{#8  0x7f795c7d216f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95}}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Commented] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584020#comment-16584020
 ] 

ASF GitHub Bot commented on PROTON-1842:


Github user cliffjansen closed the pull request at:

https://github.com/apache/qpid-proton/pull/145


> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.22.0
>Reporter: Chuck Rolke
>Assignee: Alan Conway
>Priority: Major
> Fix For: proton-c-0.25.0
>
> Attachments: helloworld.cpp, race.tsan, race.vg
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
> {{  void on_connection_open(proton::connection& c) {}}
> {{  c.close();}}
> {{  }}}
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2030}}
> {{#6  0x7f795dbe89ad in thread_run (arg=0x26be750) at 
> /home/chug/git/qpid-dispatch/src/server.c:946}}
> {{#7  0x7f795d50e50b in start_thread (arg=0x7f794f486700) at 
> pthread_create.c:465}}
> {{#8  0x7f795c7d216f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95}}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Commented] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-08-15 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580753#comment-16580753
 ] 

ASF subversion and git services commented on PROTON-1842:
-

Commit c601b98618dd4f45041501b080f761db4dd8285f in qpid-proton's branch 
refs/heads/master from Clifford Jansen
[ https://git-wip-us.apache.org/repos/asf?p=qpid-proton.git;h=c601b98 ]

PROTON-1842: revert 79d9019 temporary mitigation for follow-on long term fix


> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.22.0
>Reporter: Chuck Rolke
>Assignee: Alan Conway
>Priority: Major
> Attachments: helloworld.cpp, race.tsan, race.vg
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
> {{  void on_connection_open(proton::connection& c) {}}
> {{  c.close();}}
> {{  }}}
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2030}}
> {{#6  0x7f795dbe89ad in thread_run (arg=0x26be750) at 
> /home/chug/git/qpid-dispatch/src/server.c:946}}
> {{#7  0x7f795d50e50b in start_thread (arg=0x7f794f486700) at 
> pthread_create.c:465}}
> {{#8  0x7f795c7d216f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95}}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Commented] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-08-15 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580754#comment-16580754
 ] 

ASF subversion and git services commented on PROTON-1842:
-

Commit 803a47ed386497f6467dd4f5fb4f9f57d545695d in qpid-proton's branch 
refs/heads/master from Clifford Jansen
[ https://git-wip-us.apache.org/repos/asf?p=qpid-proton.git;h=803a47e ]

PROTON-1842: epoll proactor - add secondary/chained epollfd to maintain 1-1 
count between epoll registrations and eventual callbacks on pconnections


> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.22.0
>Reporter: Chuck Rolke
>Assignee: Alan Conway
>Priority: Major
> Attachments: helloworld.cpp, race.tsan, race.vg
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
> {{  void on_connection_open(proton::connection& c) {}}
> {{  c.close();}}
> {{  }}}
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2030}}
> {{#6  0x7f795dbe89ad in thread_run (arg=0x26be750) at 
> /home/chug/git/qpid-dispatch/src/server.c:946}}
> {{#7  0x7f795d50e50b in start_thread (arg=0x7f794f486700) at 
> pthread_create.c:465}}
> {{#8  0x7f795c7d216f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95}}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Commented] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-08-02 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16567461#comment-16567461
 ] 

ASF GitHub Bot commented on PROTON-1842:


Github user alanconway commented on the issue:

https://github.com/apache/qpid-proton/pull/145
  
I think this makes sense but it makes my head hurt a bit.


> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.22.0
>Reporter: Chuck Rolke
>Assignee: Alan Conway
>Priority: Major
> Attachments: helloworld.cpp, race.tsan, race.vg
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
> {{  void on_connection_open(proton::connection& c) {}}
> {{  c.close();}}
> {{  }}}
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2030}}
> {{#6  0x7f795dbe89ad in thread_run (arg=0x26be750) at 
> /home/chug/git/qpid-dispatch/src/server.c:946}}
> {{#7  0x7f795d50e50b in start_thread (arg=0x7f794f486700) at 
> pthread_create.c:465}}
> {{#8  0x7f795c7d216f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95}}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Commented] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-07-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532012#comment-16532012
 ] 

ASF subversion and git services commented on PROTON-1842:
-

Commit 79d901959b8dca060d76e627f65256a66abb0778 in qpid-proton's branch 
refs/heads/go1 from Clifford Jansen
[ https://git-wip-us.apache.org/repos/asf?p=qpid-proton.git;h=79d9019 ]

PROTON-1842: defer connection cleanup to reduce likelyhood of memory corruption 
on shutdown.
Already rare, now a thousand times more rare.  Comprehensive fix still in 
progress.


> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.22.0
>Reporter: Chuck Rolke
>Assignee: Alan Conway
>Priority: Major
> Attachments: helloworld.cpp, race.tsan, race.vg
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
> {{  void on_connection_open(proton::connection& c) {}}
> {{  c.close();}}
> {{  }}}
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2030}}
> {{#6  0x7f795dbe89ad in thread_run (arg=0x26be750) at 
> /home/chug/git/qpid-dispatch/src/server.c:946}}
> {{#7  0x7f795d50e50b in start_thread (arg=0x7f794f486700) at 
> pthread_create.c:465}}
> {{#8  0x7f795c7d216f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95}}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Commented] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489339#comment-16489339
 ] 

ASF GitHub Bot commented on PROTON-1842:


GitHub user cliffjansen opened a pull request:

https://github.com/apache/qpid-proton/pull/145

Potential chained epoll impl for PROTON-1842

Removes temp fix, adds chained/secondary epollfd_2.

Still has failure in helgrind threaderciser racecheck.  Still unsure if a 
problem or false positive.

Previous temp fix also has the same threaderciser error, so at least not a 
regression.

I have not been able to detect any change in performance with my various 
epoll proactor load tests.

Instrumented runs show the secondary/chained arming occurs about 0.1% in 
proton ctest.  It is closer to 0.001% in dispatch router runs under heavy load, 
but that may be less indicative of the overall frequency.

Chaining is unlikely to occur if socket output does not fill the kernel 
buffer (!EWOULDBLOCK), or if a write event mask is desired as a result of input 
(say a flow event) and EPOLLIN is also necessary with the EPOLLOUT.

Chaining *is* likely to occur if the socket is quiet (i.e. no recent output 
and waiting on EPOLLIN), and the pconnection gets a pn_connection_wake to do 
enough output to fill the kernel output buffer.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cliffjansen/qpid-proton p1842_1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/qpid-proton/pull/145.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #145


commit 1f43f37149e56180424ba0434a98d812ed7ebfb7
Author: Cliff Jansen 
Date:   2018-05-23T21:45:56Z

PROTON-1842: revert 79d9019 temporary mitigation for follow-on long term fix

commit ca5c1b55f58aecab5be3c19e14de1d799f92307d
Author: Cliff Jansen 
Date:   2018-05-24T15:46:37Z

PROTON-1842: epoll proactor - add secondary/chained epollfd to maintain 1-1 
count between epoll registrations and eventual callbacks on pconnections




> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.22.0
>Reporter: Chuck Rolke
>Assignee: Alan Conway
>Priority: Major
> Attachments: helloworld.cpp, race.tsan, race.vg
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
> {{  void on_connection_open(proton::connection& c) {}}
> {{  c.close();}}
> {{  }}}
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> 

[jira] [Commented] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-05-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479344#comment-16479344
 ] 

ASF subversion and git services commented on PROTON-1842:
-

Commit 79d901959b8dca060d76e627f65256a66abb0778 in qpid-proton's branch 
refs/heads/master from Clifford Jansen
[ https://git-wip-us.apache.org/repos/asf?p=qpid-proton.git;h=79d9019 ]

PROTON-1842: defer connection cleanup to reduce likelyhood of memory corruption 
on shutdown.
Already rare, now a thousand times more rare.  Comprehensive fix still in 
progress.


> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.22.0
>Reporter: Chuck Rolke
>Priority: Major
> Attachments: helloworld.cpp, race.tsan, race.vg
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
> {{  void on_connection_open(proton::connection& c) {}}
> {{  c.close();}}
> {{  }}}
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2030}}
> {{#6  0x7f795dbe89ad in thread_run (arg=0x26be750) at 
> /home/chug/git/qpid-dispatch/src/server.c:946}}
> {{#7  0x7f795d50e50b in start_thread (arg=0x7f794f486700) at 
> pthread_create.c:465}}
> {{#8  0x7f795c7d216f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95}}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Commented] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-05-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16473680#comment-16473680
 ] 

ASF subversion and git services commented on PROTON-1842:
-

Commit 1315f263823386f2a13767e16882d4da48329901 in qpid-proton's branch 
refs/heads/master from Clifford Jansen
[ https://git-wip-us.apache.org/repos/asf?p=qpid-proton.git;h=1315f26 ]

PROTON-1842: epoll driver cleanup needs memory barrier on parallel socket rearm


> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.22.0
>Reporter: Chuck Rolke
>Priority: Major
> Attachments: helloworld.cpp, race.tsan, race.vg
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
> {{  void on_connection_open(proton::connection& c) {}}
> {{  c.close();}}
> {{  }}}
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2030}}
> {{#6  0x7f795dbe89ad in thread_run (arg=0x26be750) at 
> /home/chug/git/qpid-dispatch/src/server.c:946}}
> {{#7  0x7f795d50e50b in start_thread (arg=0x7f794f486700) at 
> pthread_create.c:465}}
> {{#8  0x7f795c7d216f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95}}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Commented] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-05-08 Thread Alan Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16468176#comment-16468176
 ] 

Alan Conway commented on PROTON-1842:
-

Another note, the latest threaderciser shows the race with flags "-listen 
-connect -close-listen" so the only things that are racing here are IO events 
from connection errors and procator-generated wakes - there are no user wakes 
involved.

> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.22.0
>Reporter: Chuck Rolke
>Priority: Major
> Attachments: helloworld.cpp, race.tsan, race.vg
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
> {{  void on_connection_open(proton::connection& c) {}}
> {{  c.close();}}
> {{  }}}
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2030}}
> {{#6  0x7f795dbe89ad in thread_run (arg=0x26be750) at 
> /home/chug/git/qpid-dispatch/src/server.c:946}}
> {{#7  0x7f795d50e50b in start_thread (arg=0x7f794f486700) at 
> pthread_create.c:465}}
> {{#8  0x7f795c7d216f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95}}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Commented] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-05-08 Thread Alan Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467806#comment-16467806
 ] 

Alan Conway commented on PROTON-1842:
-

Note that the only way connections get closed in this version of the 
threaderciser is because a listener was closed and the connection failed 
(nobody listening, bad port, socket closed unexpectedly) so at least in the 
threaderciser version this happening as a result of an early error while the 
connection is possibly not completely set up. I'm adding socket kills to the 
connection socket now, will let you know how that goes.

> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.22.0
>Reporter: Chuck Rolke
>Priority: Major
> Attachments: helloworld.cpp, race.tsan, race.vg
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
> {{  void on_connection_open(proton::connection& c) {}}
> {{  c.close();}}
> {{  }}}
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2030}}
> {{#6  0x7f795dbe89ad in thread_run (arg=0x26be750) at 
> /home/chug/git/qpid-dispatch/src/server.c:946}}
> {{#7  0x7f795d50e50b in start_thread (arg=0x7f794f486700) at 
> pthread_create.c:465}}
> {{#8  0x7f795c7d216f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95}}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Commented] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-05-08 Thread Cliff Jansen (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467772#comment-16467772
 ] 

Cliff Jansen commented on PROTON-1842:
--

Thank-you for this additional info.

Yes, these look like the same types of stack traces after a dust up.

If they truly are, for me the pconnection_done thread (T2) got the R socket 
event, the last event batch, begin_close and free.

The competing thread (T4) got the RW socket event, then failed in numerous ways 
depending on what was in the torn down or reused freed memory.

Catastrophic stuff happens for about 1 in 1 connections for a debug build 
on a middle-aged 4c/8t desktop machine.

> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.22.0
>Reporter: Chuck Rolke
>Priority: Major
> Attachments: helloworld.cpp, race.tsan, race.vg
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
> {{  void on_connection_open(proton::connection& c) {}}
> {{  c.close();}}
> {{  }}}
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2030}}
> {{#6  0x7f795dbe89ad in thread_run (arg=0x26be750) at 
> /home/chug/git/qpid-dispatch/src/server.c:946}}
> {{#7  0x7f795d50e50b in start_thread (arg=0x7f794f486700) at 
> pthread_create.c:465}}
> {{#8  0x7f795c7d216f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95}}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Commented] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-05-08 Thread Alan Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467645#comment-16467645
 ] 

Alan Conway commented on PROTON-1842:
-

The threaderciser is showing races in connection close, I'm not sure if they 
are the same issue we are looking at here. Attached output race.vg and 
race.tsan from helgrind and the thread sanitizer. Valigrind detects a *lot* 
more races, probaby because it is slowing things down so much, but the tsan 
stack traces are consistent with valgrind.

> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.22.0
>Reporter: Chuck Rolke
>Priority: Major
> Attachments: helloworld.cpp, race.tsan, race.vg
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
> {{  void on_connection_open(proton::connection& c) {}}
> {{  c.close();}}
> {{  }}}
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2030}}
> {{#6  0x7f795dbe89ad in thread_run (arg=0x26be750) at 
> /home/chug/git/qpid-dispatch/src/server.c:946}}
> {{#7  0x7f795d50e50b in start_thread (arg=0x7f794f486700) at 
> pthread_create.c:465}}
> {{#8  0x7f795c7d216f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95}}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Commented] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-05-08 Thread Cliff Jansen (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467486#comment-16467486
 ] 

Cliff Jansen commented on PROTON-1842:
--

This test case is quite devilish.  Thank-you.

The above fix is necessary but the test case exposes more problems.

The epoll callbacks for socket IO are "mushy".  The epoll proactor handles this 
quite well (except at tear down as this JIRA has pointed out).

The proactor regularly flips between "wake me when there is socket data to 
read" and "wake me when I can read OR write".  On any transition, even with 
EPOLLONESHOT, it is not possible to know if one or two threads might be awoken, 
and if two are, which will get the context lock first.

I am seeing the following calls to pconnection process where the first two are 
sequential and the latter two obviously overlap:

 event = RW .. rearm(RW) .. wake self


 inbound_wake .. rearm(R)


 event = R .. segfault on NULL or assert fail on closed fd


 event = RW .. begin close .. cleanup .. self delete

> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.22.0
>Reporter: Chuck Rolke
>Priority: Major
> Attachments: helloworld.cpp
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
> {{  void on_connection_open(proton::connection& c) {}}
> {{  c.close();}}
> {{  }}}
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2030}}
> {{#6  0x7f795dbe89ad in thread_run (arg=0x26be750) at 
> /home/chug/git/qpid-dispatch/src/server.c:946}}
> {{#7  0x7f795d50e50b in start_thread (arg=0x7f794f486700) at 
> pthread_create.c:465}}
> {{#8  0x7f795c7d216f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95}}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Commented] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-05-08 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466940#comment-16466940
 ] 

ASF subversion and git services commented on PROTON-1842:
-

Commit 0a5c18a1034deaa76deb9367d068e80c2726aa7c in qpid-proton's branch 
refs/heads/master from Clifford Jansen
[ https://git-wip-us.apache.org/repos/asf?p=qpid-proton.git;h=0a5c18a ]

PROTON-1842: timer rearm outside lock can try to access freed memory if another 
thread ends the pconnection ahead of the rearm


> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.22.0
>Reporter: Chuck Rolke
>Priority: Major
> Attachments: helloworld.cpp
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
> {{  void on_connection_open(proton::connection& c) {}}
> {{  c.close();}}
> {{  }}}
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2030}}
> {{#6  0x7f795dbe89ad in thread_run (arg=0x26be750) at 
> /home/chug/git/qpid-dispatch/src/server.c:946}}
> {{#7  0x7f795d50e50b in start_thread (arg=0x7f794f486700) at 
> pthread_create.c:465}}
> {{#8  0x7f795c7d216f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95}}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Commented] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-05-03 Thread Chuck Rolke (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462513#comment-16462513
 ] 

Chuck Rolke commented on PROTON-1842:
-

Reproducer setup.
 # Steal proton/cpp/examples/helloworld.cpp and turn it into the connection 
blaster.
 # A vanilla router with a listener on port 5672, no auth. Turn off logging.

Run the helloworld blaster. The router fails with just one instance of the 
blaster but fails faster with multiple instances.

 

> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.22.0
>Reporter: Chuck Rolke
>Priority: Major
> Attachments: helloworld.cpp
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
> {{  void on_connection_open(proton::connection& c) {}}
> {{  c.close();}}
> {{  }}}
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2030}}
> {{#6  0x7f795dbe89ad in thread_run (arg=0x26be750) at 
> /home/chug/git/qpid-dispatch/src/server.c:946}}
> {{#7  0x7f795d50e50b in start_thread (arg=0x7f794f486700) at 
> pthread_create.c:465}}
> {{#8  0x7f795c7d216f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95}}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org