[jira] [Reopened] (QPIDJMS-376) notify the ExceptionListner when a consumer with a MessageListener remotely closes

2018-05-08 Thread Robbie Gemmell (JIRA)

 [ 
https://issues.apache.org/jira/browse/QPIDJMS-376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robbie Gemmell reopened QPIDJMS-376:


> notify the ExceptionListner when a consumer with a MessageListener remotely 
> closes
> --
>
> Key: QPIDJMS-376
> URL: https://issues.apache.org/jira/browse/QPIDJMS-376
> Project: Qpid JMS
>  Issue Type: Bug
>  Components: qpid-jms-client
>Affects Versions: 0.31.0
> Environment: AMQP Server: Enmasse 0.17.1
> Enmasse Address Type: anycast
>Reporter: Daniel Maier
>Priority: Major
> Fix For: 0.32.0
>
> Attachments: clientlogs.txt
>
>
> When I create a consumer on an address that does not exist, I expected either 
> an exception or a retry by the client. Instead, there does not even seem to 
> be a log message indicating a failure.
> Is this intended behavior or a bug? More generally: if the AMQP server 
> closes the receiver link, the Qpid JMS client neither notifies the user nor 
> re-establishes the link.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Closed] (QPIDJMS-376) notify the ExceptionListener when a consumer with a MessageListener remotely closes

2018-05-08 Thread Robbie Gemmell (JIRA)

 [ 
https://issues.apache.org/jira/browse/QPIDJMS-376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robbie Gemmell closed QPIDJMS-376.
--
Resolution: Fixed

> notify the ExceptionListener when a consumer with a MessageListener remotely 
> closes
> ---
>
> Key: QPIDJMS-376
> URL: https://issues.apache.org/jira/browse/QPIDJMS-376
> Project: Qpid JMS
>  Issue Type: Bug
>  Components: qpid-jms-client
>Affects Versions: 0.31.0
> Environment: AMQP Server: Enmasse 0.17.1
> Enmasse Address Type: anycast
>Reporter: Daniel Maier
>Priority: Major
> Fix For: 0.32.0
>
> Attachments: clientlogs.txt
>
>
> When I create a consumer on an address that does not exist, I expected either 
> an exception or a retry by the client. Instead, there does not even seem to 
> be a log message indicating a failure.
> Is this intended behavior or a bug? More generally: if the AMQP server 
> closes the receiver link, the Qpid JMS client neither notifies the user nor 
> re-establishes the link.






[jira] [Updated] (QPIDJMS-376) notify the ExceptionListener when a consumer with a MessageListener remotely closes

2018-05-08 Thread Robbie Gemmell (JIRA)

 [ 
https://issues.apache.org/jira/browse/QPIDJMS-376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robbie Gemmell updated QPIDJMS-376:
---
Summary: notify the ExceptionListener when a consumer with a 
MessageListener remotely closes  (was: notify the ExceptionListner when a 
consumer with a MessageListener remotely closes)

> notify the ExceptionListener when a consumer with a MessageListener remotely 
> closes
> ---
>
> Key: QPIDJMS-376
> URL: https://issues.apache.org/jira/browse/QPIDJMS-376
> Project: Qpid JMS
>  Issue Type: Bug
>  Components: qpid-jms-client
>Affects Versions: 0.31.0
> Environment: AMQP Server: Enmasse 0.17.1
> Enmasse Address Type: anycast
>Reporter: Daniel Maier
>Priority: Major
> Fix For: 0.32.0
>
> Attachments: clientlogs.txt
>
>
> When I create a consumer on an address that does not exist, I expected either 
> an exception or a retry by the client. Instead, there does not even seem to 
> be a log message indicating a failure.
> Is this intended behavior or a bug? More generally: if the AMQP server 
> closes the receiver link, the Qpid JMS client neither notifies the user nor 
> re-establishes the link.






[jira] [Closed] (QPIDJMS-379) Reduce garbage created on input from transport

2018-05-08 Thread Robbie Gemmell (JIRA)

 [ 
https://issues.apache.org/jira/browse/QPIDJMS-379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robbie Gemmell closed QPIDJMS-379.
--
Resolution: Fixed

> Reduce garbage created on input from transport
> --
>
> Key: QPIDJMS-379
> URL: https://issues.apache.org/jira/browse/QPIDJMS-379
> Project: Qpid JMS
>  Issue Type: Improvement
>  Components: qpid-jms-client
>Affects Versions: 0.31.0
>Reporter: Timothy Bish
>Assignee: Timothy Bish
>Priority: Major
> Fix For: 0.32.0
>
>
> The input processor can be simplified to reduce the temporary objects created 
> when processing incoming bytes.






[jira] [Reopened] (QPIDJMS-379) Reduce garbage create on input from transport

2018-05-08 Thread Robbie Gemmell (JIRA)

 [ 
https://issues.apache.org/jira/browse/QPIDJMS-379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robbie Gemmell reopened QPIDJMS-379:


> Reduce garbage create on input from transport
> -
>
> Key: QPIDJMS-379
> URL: https://issues.apache.org/jira/browse/QPIDJMS-379
> Project: Qpid JMS
>  Issue Type: Improvement
>  Components: qpid-jms-client
>Affects Versions: 0.31.0
>Reporter: Timothy Bish
>Assignee: Timothy Bish
>Priority: Major
> Fix For: 0.32.0
>
>
> The input processor can be simplified to reduce the temporary objects created 
> when processing incoming bytes.






[jira] [Updated] (QPIDJMS-379) Reduce garbage created on input from transport

2018-05-08 Thread Robbie Gemmell (JIRA)

 [ 
https://issues.apache.org/jira/browse/QPIDJMS-379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robbie Gemmell updated QPIDJMS-379:
---
Summary: Reduce garbage created on input from transport  (was: Reduce 
garbage create on input from transport)

> Reduce garbage created on input from transport
> --
>
> Key: QPIDJMS-379
> URL: https://issues.apache.org/jira/browse/QPIDJMS-379
> Project: Qpid JMS
>  Issue Type: Improvement
>  Components: qpid-jms-client
>Affects Versions: 0.31.0
>Reporter: Timothy Bish
>Assignee: Timothy Bish
>Priority: Major
> Fix For: 0.32.0
>
>
> The input processor can be simplified to reduce the temporary objects created 
> when processing incoming bytes.






[ANNOUNCE] Apache Qpid JMS 0.32.0 released

2018-05-08 Thread Robbie Gemmell
The Apache Qpid (http://qpid.apache.org) community is pleased to
announce the immediate availability of Apache Qpid JMS 0.32.0.

This is the latest release of our newer JMS client supporting the
Advanced Message Queuing Protocol 1.0 (AMQP 1.0, ISO/IEC 19464,
http://www.amqp.org), based around the Apache Qpid Proton protocol
engine and implementing the AMQP JMS Mapping as it evolves at OASIS.

The release is available now from our website:
http://qpid.apache.org/download.html

Binaries are also available via Maven Central:
http://qpid.apache.org/maven.html

Release notes can be found at:
http://qpid.apache.org/releases/qpid-jms-0.32.0/release-notes.html

Thanks to all involved,
Robbie




[jira] [Created] (DISPATCH-989) symlinks in tree to non existent files, possibly stale and could be removed?

2018-05-08 Thread Robbie Gemmell (JIRA)
Robbie Gemmell created DISPATCH-989:
---

 Summary: symlinks in tree to non existent files, possibly stale 
and could be removed?
 Key: DISPATCH-989
 URL: https://issues.apache.org/jira/browse/DISPATCH-989
 Project: Qpid Dispatch
  Issue Type: Task
  Components: Console
Affects Versions: 1.1.0
Reporter: Robbie Gemmell
Assignee: Ernest Allen
 Fix For: 1.2.0


There are a number of symlinks in the console tree which point to files that no 
longer exist. It's not clear that these are still required; they may be stale, 
so could they be removed? The targets were seemingly removed by DISPATCH-917.

Dir console/test/css/:
brokers.ttf -> ../../stand-alone/plugin/css/brokers.ttf
dispatch.css -> ../../stand-alone/plugin/css/dispatch.css
plugin.css -> ../../stand-alone/plugin/css/plugin.css
site-base.css -> ../../stand-alone/plugin/css/site-base.css

Dir console/test/html/:
qdrConnect.html -> ../../stand-alone/plugin/html/qdrConnect.html
qdrLayout.html -> ../../stand-alone/plugin/html/qdrLayout.html

Dir console/test/js/:
qdrService.js -> ../../stand-alone/plugin/js/qdrService.js

Dir console/test/lib/:
rhea-min.js -> ../../stand-alone/plugin/lib/rhea-min.js






[jira] [Resolved] (DISPATCH-988) Documentation of policy default vhost is wrong

2018-05-08 Thread Chuck Rolke (JIRA)

 [ 
https://issues.apache.org/jira/browse/DISPATCH-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chuck Rolke resolved DISPATCH-988.
--
   Resolution: Fixed
Fix Version/s: 1.2.0

Fixed at Commit 945cac6fd

> Documentation of policy default vhost is wrong
> --
>
> Key: DISPATCH-988
> URL: https://issues.apache.org/jira/browse/DISPATCH-988
> Project: Qpid Dispatch
>  Issue Type: Bug
>Affects Versions: 1.0.1
>Reporter: Chuck Rolke
>Assignee: Chuck Rolke
>Priority: Major
> Fix For: 1.2.0
>
>
> The policy defaultVhost property is described incorrectly. It is enabled by 
> default and set to the vhost name _$default_. Default vhost processing is 
> disabled when 1) the defaultVhost property is set to blank or 2) there is no 
> vhost whose hostname matches the defaultVhost setting.






[GitHub] qpid-dispatch pull request #301: DISPATCH-927 - System test for fix. Makes s...

2018-05-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/qpid-dispatch/pull/301


---




[jira] [Commented] (DISPATCH-927) detach not echoed back on multi-hop link route

2018-05-08 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/DISPATCH-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467416#comment-16467416
 ] 

ASF subversion and git services commented on DISPATCH-927:
--

Commit 7e4dfd7334ea994719e178cba78998c1933f60dc in qpid-dispatch's branch 
refs/heads/master from [~fgiorget]
[ https://git-wip-us.apache.org/repos/asf?p=qpid-dispatch.git;h=7e4dfd7 ]

DISPATCH-927 - System test for fix. Makes sure both detaches are echoed back


> detach not echoed back on multi-hop link route
> --
>
> Key: DISPATCH-927
> URL: https://issues.apache.org/jira/browse/DISPATCH-927
> Project: Qpid Dispatch
>  Issue Type: Bug
>  Components: Container
>Affects Versions: 1.0.0
>Reporter: Gordon Sim
>Assignee: Ganesh Murthy
>Priority: Major
> Fix For: 1.1.0
>
> Attachments: DISPATCH-927.patch, broker.xml, simple-topic-a.conf, 
> simple-topic-b.conf, simple_recv_modified.py
>
>
> In a two router network, router-a and router-b, a link route is defined in 
> both directions on both routers. There is also an associated connector to a 
> broker on router-b. The address is configured to be a topic on the broker.
> If two receivers attach on this address to router-a, and then detach at the 
> same time having received the defined number of messages, frequently (but not 
> always), one of the receivers will not get a detach echoed back to it.
> On inspection of protocol traces, it appears that router-b, though it gets 
> the detach echoed back from the broker, does not forward this back to the 
> client (via router-a).






[jira] [Commented] (DISPATCH-927) detach not echoed back on multi-hop link route

2018-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DISPATCH-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467418#comment-16467418
 ] 

ASF GitHub Bot commented on DISPATCH-927:
-

Github user asfgit closed the pull request at:

https://github.com/apache/qpid-dispatch/pull/301


> detach not echoed back on multi-hop link route
> --
>
> Key: DISPATCH-927
> URL: https://issues.apache.org/jira/browse/DISPATCH-927
> Project: Qpid Dispatch
>  Issue Type: Bug
>  Components: Container
>Affects Versions: 1.0.0
>Reporter: Gordon Sim
>Assignee: Ganesh Murthy
>Priority: Major
> Fix For: 1.1.0
>
> Attachments: DISPATCH-927.patch, broker.xml, simple-topic-a.conf, 
> simple-topic-b.conf, simple_recv_modified.py
>
>
> In a two router network, router-a and router-b, a link route is defined in 
> both directions on both routers. There is also an associated connector to a 
> broker on router-b. The address is configured to be a topic on the broker.
> If two receivers attach on this address to router-a, and then detach at the 
> same time having received the defined number of messages, frequently (but not 
> always), one of the receivers will not get a detach echoed back to it.
> On inspection of protocol traces, it appears that router-b, though it gets 
> the detach echoed back from the broker, does not forward this back to the 
> client (via router-a).






[jira] [Created] (DISPATCH-990) Use patterns for policy vhost hostnames

2018-05-08 Thread Chuck Rolke (JIRA)
Chuck Rolke created DISPATCH-990:


 Summary: Use patterns for policy vhost hostnames
 Key: DISPATCH-990
 URL: https://issues.apache.org/jira/browse/DISPATCH-990
 Project: Qpid Dispatch
  Issue Type: Bug
Reporter: Chuck Rolke


Currently, each policy vhost hostname identifies a single host. Vhost policy 
would be much more flexible if the hostnames could be specified with 
pattern-matching wildcards:

{{  #.corporate.example.com}}
{{  #.labs.example.com}}
{{  *.users.example.com}}
{{  #.example.com}}
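
To make the proposal concrete, here is a minimal sketch of how such patterns might be matched against a hostname. It assumes AMQP topic-style semantics that the issue itself does not spell out: labels are separated by '.', '*' matches exactly one label, and '#' matches zero or more labels. The function names are hypothetical and are not taken from the Dispatch source.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split "a.b.c" into {"a", "b", "c"}.
static std::vector<std::string> splitOnDots(const std::string& s) {
    std::vector<std::string> labels;
    std::istringstream in(s);
    std::string label;
    while (std::getline(in, label, '.')) labels.push_back(label);
    return labels;
}

// Recursive matcher over label lists.
// ASSUMPTION: '*' = exactly one label, '#' = zero or more labels.
static bool matchLabels(const std::vector<std::string>& pat, std::size_t pi,
                        const std::vector<std::string>& host, std::size_t hi) {
    if (pi == pat.size()) return hi == host.size();
    if (pat[pi] == "#") {
        // Let '#' absorb 0, 1, 2, ... labels and try the rest of the pattern.
        for (std::size_t next = hi; next <= host.size(); ++next) {
            if (matchLabels(pat, pi + 1, host, next)) return true;
        }
        return false;
    }
    if (hi == host.size()) return false;  // pattern needs a label, none left
    if (pat[pi] == "*" || pat[pi] == host[hi]) {
        return matchLabels(pat, pi + 1, host, hi + 1);
    }
    return false;
}

// Hypothetical entry point: does the vhost pattern cover this hostname?
bool vhostPatternMatches(const std::string& pattern, const std::string& hostname) {
    return matchLabels(splitOnDots(pattern), 0, splitOnDots(hostname), 0);
}
```

Under these assumed semantics, {{#.example.com}} would cover both {{labs.example.com}} and {{example.com}} itself, while {{*.users.example.com}} would cover {{alice.users.example.com}} but not {{a.b.users.example.com}}.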

 






[jira] [Commented] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-05-08 Thread Cliff Jansen (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467486#comment-16467486
 ] 

Cliff Jansen commented on PROTON-1842:
--

This test case is quite devilish.  Thank you.

The above fix is necessary but the test case exposes more problems.

The epoll callbacks for socket IO are "mushy".  The epoll proactor handles this 
quite well (except at tear down as this JIRA has pointed out).

The proactor regularly flips between "wake me when there is socket data to 
read" and "wake me when I can read OR write".  On any transition, even with 
EPOLLONESHOT, it is not possible to know if one or two threads might be awoken, 
and if two are, which will get the context lock first.

I am seeing the following calls to pconnection process where the first two are 
sequential and the latter two obviously overlap:

 event = RW .. rearm(RW) .. wake self


 inbound_wake .. rearm(R)


 event = R .. segfault on NULL or assert fail on closed fd


 event = RW .. begin close .. cleanup .. self delete
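
For readers unfamiliar with the re-arm step mentioned above, the following is a minimal, single-threaded sketch of one-shot registration and re-arming with the Linux epoll API. It is not the proactor's code and does not reproduce the two-thread race; the helper names are hypothetical. It only shows that a one-shot registration goes quiet after one event, and that switching the interest set from read-only to read-or-write takes an explicit EPOLL_CTL_MOD.

```cpp
#include <cassert>
#include <cstdint>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: register fd for a single notification (op = EPOLL_CTL_ADD)
// or re-arm an existing one-shot registration (op = EPOLL_CTL_MOD).
bool armOneShot(int epfd, int fd, std::uint32_t interest, int op) {
    epoll_event ev{};
    ev.events = interest | EPOLLONESHOT;  // disabled again after one event
    ev.data.fd = fd;
    return ::epoll_ctl(epfd, op, fd, &ev) == 0;
}

// Hypothetical helper: poll without blocking (timeout 0).
// Returns the reported event mask, or 0 if nothing is pending.
std::uint32_t pollOnce(int epfd) {
    epoll_event ev{};
    const int n = ::epoll_wait(epfd, &ev, 1, 0);
    return n == 1 ? ev.events : 0u;
}
```

With a readable socket, a first pollOnce() reports EPOLLIN and a second reports nothing until armOneShot() is called again with EPOLL_CTL_MOD. In a multi-threaded proactor, that re-arm call is the point at which a second thread can be woken for the same connection.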

> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.22.0
>Reporter: Chuck Rolke
>Priority: Major
> Attachments: helloworld.cpp
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
> {{  void on_connection_open(proton::connection& c) {}}
> {{  c.close();}}
> {{  }}}
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2030}}
> {{#6  0x7f795dbe89ad in thread_run (arg=0x26be750) at 
> /home/chug/git/qpid-dispatch/src/server.c:946}}
> {{#7  0x7f795d50e50b in start_thread (arg=0x7f794f486700) at 
> pthread_create.c:465}}
> {{#8  0x7f795c7d216f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95}}
>  






[jira] [Created] (QPID-8185) [JMS AMQP 0-x][AMQP 0-8..0-91] Make sure that client closes TCP connection on failure with sending connection.close

2018-05-08 Thread Alex Rudyy (JIRA)
Alex Rudyy created QPID-8185:


 Summary: [JMS AMQP 0-x][AMQP 0-8..0-91] Make sure that client 
closes TCP connection on failure with sending connection.close
 Key: QPID-8185
 URL: https://issues.apache.org/jira/browse/QPID-8185
 Project: Qpid
  Issue Type: Improvement
  Components: JMS AMQP 0-x
Affects Versions: qpid-java-client-0-x-6.3.0, qpid-java-6.0.8, 0.32, 0.30, 
0.28, 0.26, 0.24, 0.22, 0.20, 0.18, qpid-java-6.1.6
Reporter: Alex Rudyy
 Fix For: qpid-java-client-0-x-6.3.1


Sending connection.close as part of {{Connection#close}} can end in a timeout 
exception. The underlying TCP connection remains open, and the Broker can 
continue sending data to the client when session close times out as well. The 
incoming frames cannot be associated with the sessions, as the client removes 
session information on connection close, which results in a number of confusing 
exceptions.






[jira] [Updated] (QPID-8185) [JMS AMQP 0-x][AMQP 0-8..0-91] Make sure that client closes TCP connection on failure with sending connection.close

2018-05-08 Thread Alex Rudyy (JIRA)

 [ 
https://issues.apache.org/jira/browse/QPID-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rudyy updated QPID-8185:
-
Description: Sending connection.close as part of {{Connection#close}} can 
end-up in timeout exception. The underlying TCP connection remains open and 
Broker can continue sending data to the client when session close ends up in 
timeout as well. The incoming frames cannot be associated with the sessions, as 
the client removes session information on connection close. As result in a 
number of confusing exceptions is reported  (was: Sending connection.close as 
part of {{Connection#close}} can end-up in timeout exception. The underlying 
TCP connection remains open and Broker can continue sending data to the client 
when session close ends up in timeout as well. The incoming frames cannot be 
associated with the sessions, as the client removes session information on 
connection close, which results in a number of confusing exceptions.)

> [JMS AMQP 0-x][AMQP 0-8..0-91] Make sure that client closes TCP connection on 
> failure with sending connection.close
> ---
>
> Key: QPID-8185
> URL: https://issues.apache.org/jira/browse/QPID-8185
> Project: Qpid
>  Issue Type: Improvement
>  Components: JMS AMQP 0-x
>Affects Versions: qpid-java-6.1.6, 0.18, 0.20, 0.22, 0.24, 0.26, 0.28, 
> 0.30, 0.32, qpid-java-6.0.8, qpid-java-client-0-x-6.3.0
>Reporter: Alex Rudyy
>Priority: Major
> Fix For: qpid-java-client-0-x-6.3.1
>
>
> Sending connection.close as part of {{Connection#close}} can end-up in 
> timeout exception. The underlying TCP connection remains open and Broker can 
> continue sending data to the client when session close ends up in timeout as 
> well. The incoming frames cannot be associated with the sessions, as the 
> client removes session information on connection close. As result in a number 
> of confusing exceptions is reported






[jira] [Updated] (QPID-8185) [JMS AMQP 0-x][AMQP 0-8..0-91] Make sure that client closes TCP connection on failure with sending connection.close

2018-05-08 Thread Alex Rudyy (JIRA)

 [ 
https://issues.apache.org/jira/browse/QPID-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rudyy updated QPID-8185:
-
Description: Sending connection.close as part of {{Connection#close}} can 
end-up in timeout exception. The underlying TCP connection remains open and 
Broker can continue sending data to the client when session close ends up in 
timeout as well. The incoming frames cannot be associated with the sessions, as 
the client removes session information on connection close. As result, a number 
of confusing exceptions is reported  (was: Sending connection.close as part of 
{{Connection#close}} can end-up in timeout exception. The underlying TCP 
connection remains open and Broker can continue sending data to the client when 
session close ends up in timeout as well. The incoming frames cannot be 
associated with the sessions, as the client removes session information on 
connection close. As result in a number of confusing exceptions is reported)

> [JMS AMQP 0-x][AMQP 0-8..0-91] Make sure that client closes TCP connection on 
> failure with sending connection.close
> ---
>
> Key: QPID-8185
> URL: https://issues.apache.org/jira/browse/QPID-8185
> Project: Qpid
>  Issue Type: Improvement
>  Components: JMS AMQP 0-x
>Affects Versions: qpid-java-6.1.6, 0.18, 0.20, 0.22, 0.24, 0.26, 0.28, 
> 0.30, 0.32, qpid-java-6.0.8, qpid-java-client-0-x-6.3.0
>Reporter: Alex Rudyy
>Priority: Major
> Fix For: qpid-java-client-0-x-6.3.1
>
>
> Sending connection.close as part of {{Connection#close}} can end-up in 
> timeout exception. The underlying TCP connection remains open and Broker can 
> continue sending data to the client when session close ends up in timeout as 
> well. The incoming frames cannot be associated with the sessions, as the 
> client removes session information on connection close. As result, a number 
> of confusing exceptions is reported






[jira] [Updated] (QPID-8185) [JMS AMQP 0-x][AMQP 0-8..0-91] Make sure that client closes TCP connection on failure with sending connection.close

2018-05-08 Thread Alex Rudyy (JIRA)

 [ 
https://issues.apache.org/jira/browse/QPID-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rudyy updated QPID-8185:
-
Attachment: 0001-JMS-AMQP-0-x-AMQP-0-8.0-91-Make-sure-that-client-clo.patch

> [JMS AMQP 0-x][AMQP 0-8..0-91] Make sure that client closes TCP connection on 
> failure with sending connection.close
> ---
>
> Key: QPID-8185
> URL: https://issues.apache.org/jira/browse/QPID-8185
> Project: Qpid
>  Issue Type: Improvement
>  Components: JMS AMQP 0-x
>Affects Versions: qpid-java-6.1.6, 0.18, 0.20, 0.22, 0.24, 0.26, 0.28, 
> 0.30, 0.32, qpid-java-6.0.8, qpid-java-client-0-x-6.3.0
>Reporter: Alex Rudyy
>Priority: Major
> Fix For: qpid-java-client-0-x-6.3.1
>
> Attachments: 
> 0001-JMS-AMQP-0-x-AMQP-0-8.0-91-Make-sure-that-client-clo.patch
>
>
> Sending connection.close as part of {{Connection#close}} can end-up in 
> timeout exception. The underlying TCP connection remains open and Broker can 
> continue sending data to the client when session close ends up in timeout as 
> well. The incoming frames cannot be associated with the sessions, as the 
> client removes session information on connection close. As result, a number 
> of confusing exceptions is reported



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Comment Edited] (QPID-8184) [linearstore] Recovery intermittently produces JERR_EFP_BADEFPDIRNAME error followed by core

2018-05-08 Thread Kim van der Riet (JIRA)

[ 
https://issues.apache.org/jira/browse/QPID-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16466468#comment-16466468
 ] 

Kim van der Riet edited comment on QPID-8184 at 5/8/18 3:41 PM:


Pavel Moravec has discovered the root cause of this issue, see 
[https://bugzilla.redhat.com/show_bug.cgi?id=1561819#c18]. It appears that when 
using ::readlink(), the string containing the link destination is copied into 
the supplied buffer, but without being terminated with a '\0'. In some cases, 
there is remaining data in the buffer which when searched from the rear of the 
string yields odd results.

The issue appears to be solved by simply terminating the string in the buffer 
with a '\0'.


was (Author: kpvdr):
Pavel Moravec has discovered the root cause of this issue, see 
[https://bugzilla.redhat.com/show_bug.cgi?id=1561819#c18.] It appears that when 
using ::readlink(), the string containing the link destination is copied into 
the supplied buffer, but without being terminated with a '\0'. In some cases, 
there is remaining data in the buffer which when searched from the rear of the 
string yields odd results.

The issue appears to be solved by simply terminating the string in the buffer 
with a '\0'.
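
The fix described in this comment can be sketched as follows. This is an illustration only, not the actual linearstore patch: the helper name and buffer size are hypothetical. The key point is that readlink() returns the number of bytes it wrote and never appends a terminator, so the caller must terminate the buffer (or build the string from the returned length) before searching it.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unistd.h>

// Hypothetical helper (not from the Qpid source): return a symlink's target.
std::string readSymlinkTarget(const std::string& path) {
    char buf[4096];  // illustrative size; stale bytes may follow the copied target
    // Leave room for the terminator: readlink() does NOT write a '\0'.
    const ssize_t len = ::readlink(path.c_str(), buf, sizeof(buf) - 1);
    if (len < 0) {
        throw std::runtime_error("readlink failed for " + path);
    }
    buf[len] = '\0';  // terminate explicitly, as the comment suggests
    // Building the string from the returned length also ignores any stale data.
    return std::string(buf, static_cast<std::size_t>(len));
}
```

Any later search from the end of the string, for example to extract the EFP directory name, then only sees the bytes that readlink() actually wrote.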

> [linearstore] Recovery intermittently produces JERR_EFP_BADEFPDIRNAME error 
> followed by core
> 
>
> Key: QPID-8184
> URL: https://issues.apache.org/jira/browse/QPID-8184
> Project: Qpid
>  Issue Type: Bug
>  Components: C++ Broker
>Reporter: Kim van der Riet
>Assignee: Kim van der Riet
>Priority: Major
>
> Some users are experiencing difficulty recovering the store, especially when 
> there are a large  number of queues (several thousand). The log files show 
> the following pattern:
> {{JERR_EFP_BADEFPDIRNAME}} in which some arbitrary number which is not 
> divisible by 4 is being used as the EFP file size (called EFP directory in 
> the log), followed by a segfault:
> {noformat}
> May 4 18:55:00 prodrhs1l qpidd[6240]: 2018-05-04 18:55:00 [Store] warning 
> Linear Store: EmptyFilePool create failed: jexception 0x0d03 
> EmptyFilePool::fileSizeKbFromDirName() threw JERR_EFP_BADEFPDIRNAME: Bad 
> Empty File Pool directory name (must be 'NNNk', where NNN is a number which 
> is a multiple of 4) (Partition: 1; EFP directory: '9k')
> May 4 18:55:00 prodrhs1l kernel: qpidd[6240]: segfault at 10 ip 
> 7f4219af8e19 sp 7ffc227a6350 error 4 in 
> linearstore.so[7f4219ac4000+bd000]{noformat}
>  In the event that the random number _is_ divisible by 4, a randomly sized 
> directory containing no files may appear in the partition EFP.






[jira] [Updated] (QPID-8185) [JMS AMQP 0-x][AMQP 0-8..0-91] Make sure that client closes TCP connection on failure with sending connection.close

2018-05-08 Thread Alex Rudyy (JIRA)

 [ 
https://issues.apache.org/jira/browse/QPID-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rudyy updated QPID-8185:
-
Description: 
Sending connection.close as part of {{Connection#close}} can end in a timeout 
exception. The underlying TCP connection then remains open, and the Broker can 
continue sending data to the client when session close times out as well. The 
incoming frames cannot be associated with sessions, because the client removes 
session information on connection close. As a result, a number of confusing 
exceptions are reported.

Here are examples of the exception stack traces reported for this issue:
{noformat}
INFO  Unsuspending channel threw an exception: [Thread-227][AMQSession.java:2374]
org.apache.qpid.AMQTimeoutException: Server did not respond in a timely fashion
    at org.apache.qpid.client.util.BlockingWaiter.block(BlockingWaiter.java:170) ~[qpid-client-0.32.jar:0.32]
    at org.apache.qpid.client.protocol.BlockingMethodFrameListener.blockForFrame(BlockingMethodFrameListener.java:115) ~[qpid-client-0.32.jar:0.32]
    at org.apache.qpid.client.protocol.AMQProtocolHandler.writeCommandFrameAndWaitForReply(AMQProtocolHandler.java:715) ~[qpid-client-0.32.jar:0.32]
    at org.apache.qpid.client.protocol.AMQProtocolHandler.syncWrite(AMQProtocolHandler.java:736) ~[qpid-client-0.32.jar:0.32]
    at org.apache.qpid.client.protocol.AMQProtocolHandler.syncWrite(AMQProtocolHandler.java:730) ~[qpid-client-0.32.jar:0.32]
    at org.apache.qpid.client.AMQSession_0_8.sendSuspendChannel(AMQSession_0_8.java:728) ~[qpid-client-0.32.jar:0.32]
    at org.apache.qpid.client.AMQSession.suspendChannel(AMQSession.java:3156) [qpid-client-0.32.jar:0.32]
    at org.apache.qpid.client.AMQSession.startDispatcherIfNecessary(AMQSession.java:2370) [qpid-client-0.32.jar:0.32]
    at org.apache.qpid.client.AMQSession.syncDispatchQueue(AMQSession.java:2223) [qpid-client-0.32.jar:0.32]
    at org.apache.qpid.client.AMQSession.rollback(AMQSession.java:1881) [qpid-client-0.32.jar:0.32]

ERROR Error closing session: javax.jms.JMSException: Error closing session: org.apache.qpid.AMQTimeoutException: Server did not respond in a timely fashion [error code 408: Request Timeout][DefaultMessageListenerContainer-2][AMQConnection.java:1039]
ERROR Error closing connection [DefaultMessageListenerContainer-2][AMQConnection.java:971]
javax.jms.JMSException: Error closing session: org.apache.qpid.AMQTimeoutException: Server did not respond in a timely fashion [error code 408: Request Timeout]
    at org.apache.qpid.client.AMQSession.close(AMQSession.java:764) ~[qpid-client-0.32.jar:0.32]
    at org.apache.qpid.client.AMQSession.close(AMQSession.java:730) ~[qpid-client-0.32.jar:0.32]
    at org.apache.qpid.client.AMQConnection.closeAllSessions(AMQConnection.java:1035) [qpid-client-0.32.jar:0.32]
    at org.apache.qpid.client.AMQConnection.doClose(AMQConnection.java:962) [qpid-client-0.32.jar:0.32]
    at org.apache.qpid.client.AMQConnection.doClose(AMQConnection.java:951) [qpid-client-0.32.jar:0.32]
    at org.apache.qpid.client.AMQConnection.doClose(AMQConnection.java:951) [qpid-client-0.32.jar:0.32]
    at org.apache.qpid.client.AMQConnection.close(AMQConnection.java:935) [qpid-client-0.32.jar:0.32]
    at org.apache.qpid.client.AMQConnection.close(AMQConnection.java:916) [qpid-client-0.32.jar:0.32]
    at org.springframework.jms.connection.ConnectionFactoryUtils.releaseConnection(ConnectionFactoryUtils.java:80) [spring-jms-4.2.3.RELEASE.jar:4.2.3.RELEASE]
    at org.springframework.jms.listener.AbstractJmsListeningContainer.refreshSharedConnection(AbstractJmsListeningContainer.java:395) [spring-jms-4.2.3.RELEASE.jar:4.2.3.RELEASE]
    at org.springframework.jms.listener.DefaultMessageListenerContainer.refreshConnectionUntilSuccessful(DefaultMessageListenerContainer.java:915) [spring-jms-4.2.3.RELEASE.jar:4.2.3.RELEASE]
at 
org.springframework.jms.listener.DefaultMessageListenerContainer.recoverAfterListenerSetupFailure(DefaultMessageListenerContainer.java:890)
 [spring-jms-4.2.3.RELEASE.jar:4.2.3.RELEASE]
at 
org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.run(DefaultMessageListenerContainer.java:1061)
 [spring-jms-4.2.3.RELEASE.jar:4.2.3.RELEASE]
at java.lang.Thread.run(Thread.java:724) [na:1.7.0_40]
Caused by: org.apache.qpid.AMQTimeoutException: Server did not respond in a 
timely fashion
at 
org.apache.qpid.client.util.BlockingWaiter.block(BlockingWaiter.java:170) 
~[qpid-client-0.32.jar:0.32]
at 
org.apache.qpid.client.protocol.BlockingMethodFrameListener.blockForFrame(BlockingMethodFrameListener.java:115)
 ~[qpid-client-0.32.ja

[jira] [Commented] (PROTON-1841) [cpp] add missing ostream<< and to_string for proton::message

2018-05-08 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467603#comment-16467603
 ] 

ASF subversion and git services commented on PROTON-1841:
-

Commit 3d46b4f0220e7e56ce5167ab87d4df15f3ca1583 in qpid-proton's branch 
refs/heads/master from [~aconway]
[ https://git-wip-us.apache.org/repos/asf?p=qpid-proton.git;h=3d46b4f ]

PROTON-1841: [cpp] add missing ostream<< and to_string for proton::message


> [cpp] add missing ostream<< and to_string for proton::message
> -
>
> Key: PROTON-1841
> URL: https://issues.apache.org/jira/browse/PROTON-1841
> Project: Qpid Proton
> Issue Type: Bug
> Components: cpp-binding
> Affects Versions: proton-c-0.22.0
> Reporter: Alan Conway
> Assignee: Alan Conway
> Priority: Major
> Fix For: proton-c-0.23.0
>
>
> proton::message lacks an ostream operator<< and to_string function, which are 
> provided for proton::value and most other types in the library. It can be 
> implemented using C pn_inspect.
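
A minimal sketch of the usual pairing of {{operator<<}} and {{to_string}} is shown below. It uses a hypothetical {{demo::message}} struct rather than proton::message, and it does not call pn_inspect; the names and the output format are assumptions made only for illustration.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

namespace demo {

// Hypothetical stand-in for a message type; this is NOT proton::message.
struct message {
    std::string address;
    std::string body;
};

// Stream a human-readable representation of the message.
inline std::ostream& operator<<(std::ostream& os, const message& m) {
    return os << "Message{address=\"" << m.address
              << "\", body=\"" << m.body << "\"}";
}

// to_string is defined in terms of operator<< so the two cannot disagree.
inline std::string to_string(const message& m) {
    std::ostringstream os;
    os << m;
    return os.str();
}

// Small self-check showing the formatting chosen for this sketch.
inline void example() {
    message m{"examples", "hello"};
    assert(to_string(m) == "Message{address=\"examples\", body=\"hello\"}");
}

}  // namespace demo
```

Defining {{to_string}} on top of the stream operator keeps a single formatting routine, which is one reasonable way to offer both entry points.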



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Updated] (PROTON-1816) [c] deprecate old netaddr function names

2018-05-08 Thread Alan Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Conway updated PROTON-1816:

Priority: Minor  (was: Major)

> [c] deprecate old netaddr function names
> 
>
> Key: PROTON-1816
> URL: https://issues.apache.org/jira/browse/PROTON-1816
> Project: Qpid Proton
> Issue Type: Improvement
> Components: proton-c
> Affects Versions: proton-j-0.22.0
> Reporter: Alan Conway
> Assignee: Alan Conway
> Priority: Minor
> Fix For: proton-c-0.23.0
>
>
> See PROTON-1781 - the functions were re-named but the deprecation macros were 
> commented out to give people a release cycle to adjust to the new names.
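
The staged deprecation described above might look roughly like this. The sketch is in C++ with made-up names ({{DEMO_DEPRECATED}}, {{DEMO_ENABLE_DEPRECATION}}, {{demo::format_addr}}, {{demo::addr_str}}); it does not reproduce the actual proton-c macros or netaddr functions.

```cpp
#include <cassert>
#include <string>

// Hypothetical switch: left undefined for one release cycle so callers of the
// old name still build without warnings, then defined to turn warnings on.
#ifdef DEMO_ENABLE_DEPRECATION
#define DEMO_DEPRECATED(msg) [[deprecated(msg)]]
#else
#define DEMO_DEPRECATED(msg)
#endif

namespace demo {

// New name (hypothetical).
inline std::string format_addr(const std::string& host, int port) {
    assert(port >= 0 && port <= 65535);
    return host + ":" + std::to_string(port);
}

// Old name kept as a thin forwarding wrapper around the new one.
DEMO_DEPRECATED("use demo::format_addr")
inline std::string addr_str(const std::string& host, int port) {
    return format_addr(host, port);
}

}  // namespace demo
```

Building with {{-DDEMO_ENABLE_DEPRECATION}} would make every remaining call to {{demo::addr_str}} produce a compiler warning, while behaviour stays unchanged.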



[jira] [Resolved] (PROTON-1841) [cpp] add missing ostream<< and to_string for proton::message

2018-05-08 Thread Alan Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Conway resolved PROTON-1841.
-
Resolution: Fixed

> [cpp] add missing ostream<< and to_string for proton::message
> -
>
> Key: PROTON-1841
> URL: https://issues.apache.org/jira/browse/PROTON-1841
> Project: Qpid Proton
> Issue Type: Bug
> Components: cpp-binding
> Affects Versions: proton-c-0.22.0
> Reporter: Alan Conway
> Assignee: Alan Conway
> Priority: Major
> Fix For: proton-c-0.23.0
>
>
> proton::message lacks an ostream operator<< and to_string function, which are 
> provided for proton::value and most other types in the library. It can be 
> implemented using C pn_inspect.



[jira] [Updated] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-05-08 Thread Alan Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Conway updated PROTON-1842:

Attachment: race.vg
race.tsan

> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
> Issue Type: Bug
> Components: proton-c
> Affects Versions: proton-c-0.22.0
> Reporter: Chuck Rolke
> Priority: Major
> Attachments: helloworld.cpp, race.tsan, race.vg
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
> {{  void on_connection_open(proton::connection& c) {}}
> {{  c.close();}}
> {{  }}}
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2030}}
> {{#6  0x7f795dbe89ad in thread_run (arg=0x26be750) at 
> /home/chug/git/qpid-dispatch/src/server.c:946}}
> {{#7  0x7f795d50e50b in start_thread (arg=0x7f794f486700) at 
> pthread_create.c:465}}
> {{#8  0x7f795c7d216f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95}}
>  



[jira] [Commented] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-05-08 Thread Alan Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467645#comment-16467645
 ] 

Alan Conway commented on PROTON-1842:
-

The threaderciser is showing races in connection close; I'm not sure if they 
are the same issue we are looking at here. Attached output race.vg and 
race.tsan from helgrind and the thread sanitizer. Valgrind detects a *lot* 
more races, probably because it slows things down so much, but the tsan 
stack traces are consistent with valgrind.

> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
> Issue Type: Bug
> Components: proton-c
> Affects Versions: proton-c-0.22.0
> Reporter: Chuck Rolke
> Priority: Major
> Attachments: helloworld.cpp, race.tsan, race.vg
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
> {{  void on_connection_open(proton::connection& c) {}}
> {{  c.close();}}
> {{  }}}
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2030}}
> {{#6  0x7f795dbe89ad in thread_run (arg=0x26be750) at 
> /home/chug/git/qpid-dispatch/src/server.c:946}}
> {{#7  0x7f795d50e50b in start_thread (arg=0x7f794f486700) at 
> pthread_create.c:465}}
> {{#8  0x7f795c7d216f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95}}
>  



[jira] [Comment Edited] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-05-08 Thread Alan Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467645#comment-16467645
 ] 

Alan Conway edited comment on PROTON-1842 at 5/8/18 4:36 PM:
-

The threaderciser is showing races in connection close; I'm not sure if they 
are the same issue we are looking at here. Attached output race.vg and 
race.tsan from helgrind and the thread sanitizer. Valgrind detects a *lot* 
more races, probably because it slows things down so much, but the tsan 
stack traces are consistent with valgrind.

This looks consistent with your theory, in particular a mutex being destroyed 
concurrently with being unlocked during shutdown. One thread locks, sees 
everything is ready to finalize and destroys the connection state while the 
second thread is blocked on the mutex - it gets released when the first thread 
unlocks before pthread_destroy but explodes when it tries to unlock after the 
destroy.
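
One common way to rule out this kind of "unlock after destroy" race is to tie the mutex lifetime to shared ownership, so the state is freed only after the last user has finished unlocking. The sketch below is a generic illustration with hypothetical names ({{demo::conn_state}}, {{demo::handle_event}}); it is not the proton-c epoll proactor code and is not a proposed patch.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>

namespace demo {

// Hypothetical per-connection state; NOT the proton-c connection struct.
struct conn_state {
    std::mutex lock;
    bool closing = false;
    int events_handled = 0;
};

// Each worker owns a shared_ptr copy, so the mutex cannot be destroyed while
// any worker might still be blocked on it or about to unlock it.
inline void handle_event(std::shared_ptr<conn_state> st) {
    std::lock_guard<std::mutex> guard(st->lock);
    if (!st->closing) {
        st->closing = true;  // the first thread to get here begins the close
    }
    ++st->events_handled;
}  // guard unlocks first; only then is this thread's reference released

// Runs two competing workers against the same state and returns how many
// events were handled.
inline int run_two_workers() {
    auto st = std::make_shared<conn_state>();
    std::thread t1(handle_event, st);
    std::thread t2(handle_event, st);
    t1.join();
    t2.join();
    assert(st->closing);
    return st->events_handled;
}

}  // namespace demo
```

In the failing pattern described above, the thread that decides to finalize frees the state itself; here the decision to close and the release of memory are separated, so whichever thread drops the last reference performs the destruction.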


was (Author: aconway):
The threaderciser is showing races in connection close; I'm not sure if they 
are the same issue we are looking at here. Attached output race.vg and 
race.tsan from helgrind and the thread sanitizer. Valgrind detects a *lot* 
more races, probably because it slows things down so much, but the tsan 
stack traces are consistent with valgrind.

> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
> Issue Type: Bug
> Components: proton-c
> Affects Versions: proton-c-0.22.0
> Reporter: Chuck Rolke
> Priority: Major
> Attachments: helloworld.cpp, race.tsan, race.vg
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
> {{  void on_connection_open(proton::connection& c) {}}
> {{  c.close();}}
> {{  }}}
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2030}}
> {{#6  0x7f795dbe89ad in thread_run (arg=0x26be750) at 
> /home/chug/git/qpid-dispatch/src/server.c:946}}
> {{#7  0x7f795d50e50b in start_thread (arg=0x7f794f486700) at 
> pthread_create.c:465}}
> {{#8  0x7f795c7d216f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95}}
>  



[jira] [Comment Edited] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-05-08 Thread Alan Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467645#comment-16467645
 ] 

Alan Conway edited comment on PROTON-1842 at 5/8/18 4:52 PM:
-

The threaderciser is showing races in connection close; I'm not sure if they 
are the same issue we are looking at here. Attached output race.vg and 
race.tsan from helgrind and the thread sanitizer. Valgrind detects a *lot* 
more races, probably because it slows things down so much, but the tsan 
stack traces are consistent with valgrind.

This looks consistent with your theory, in particular a mutex being destroyed 
concurrently with being unlocked during shutdown. One thread locks, sees 
everything is ready to finalize and destroys the connection state while the 
second thread is blocked on the mutex - it gets released when the first thread 
unlocks before pthread_destroy but explodes when it tries to unlock after the 
destroy.

To run:
{code:java}
cmake -DTHREADERCISER=ON .. && make && valgrind --tool=helgrind c/tests/c-threaderciser -time 60
cmake -DENABLE_TSAN=ON -DTHREADERCISER=ON .. && make && c/tests/c-threaderciser -time 60
{code}
 


was (Author: aconway):
The threaderciser is showing races in connection close; I'm not sure if they 
are the same issue we are looking at here. Attached output race.vg and 
race.tsan from helgrind and the thread sanitizer. Valgrind detects a *lot* 
more races, probably because it slows things down so much, but the tsan 
stack traces are consistent with valgrind.

This looks consistent with your theory, in particular a mutex being destroyed 
concurrently with being unlocked during shutdown. One thread locks, sees 
everything is ready to finalize and destroys the connection state while the 
second thread is blocked on the mutex - it gets released when the first thread 
unlocks before pthread_destroy but explodes when it tries to unlock after the 
destroy.

> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
> Issue Type: Bug
> Components: proton-c
> Affects Versions: proton-c-0.22.0
> Reporter: Chuck Rolke
> Priority: Major
> Attachments: helloworld.cpp, race.tsan, race.vg
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
> {{  void on_connection_open(proton::connection& c) {}}
> {{  c.close();}}
> {{  }}}
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2030}}
> {{#6  0x7f795dbe89ad in thread_run (arg=0x26be750) at 
> /home/chug/git/qpid-dispatch/src/server.c:946}}
> {{#7  0x7f795d50e50b in start_thread (arg=0x7f794f486700) at 
> pthread_create.c:465}}
> {{#8  0x7f795c7d216f in clone () at 
> ../sysdeps/unix/sy

[jira] [Commented] (DISPATCH-989) symlinks in tree to non existent files, possibly stale and could be removed?

2018-05-08 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/DISPATCH-989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467720#comment-16467720
 ] 

ASF subversion and git services commented on DISPATCH-989:
--

Commit 7b2d8225e28c295328d3926b9dc5b26d44795540 in qpid-dispatch's branch 
refs/heads/master from [~eallen]
[ https://git-wip-us.apache.org/repos/asf?p=qpid-dispatch.git;h=7b2d822 ]

DISPATCH-989 Replace broken symlink with original files. Add npm dependency on 
rhea.


> symlinks in tree to non existent files, possibly stale and could be removed?
> 
>
> Key: DISPATCH-989
> URL: https://issues.apache.org/jira/browse/DISPATCH-989
> Project: Qpid Dispatch
> Issue Type: Task
> Components: Console
> Affects Versions: 1.1.0
> Reporter: Robbie Gemmell
> Assignee: Ernest Allen
> Priority: Minor
> Fix For: 1.2.0
>
>
> There are a number of symlinks in the console tree which point to files that 
> no longer exist. It's not clear these are actually required anymore; they may 
> be stale and could be removed? The targets were seemingly removed by 
> DISPATCH-917
> Dir console/test/css/:
> brokers.ttf -> ../../stand-alone/plugin/css/brokers.ttf
> dispatch.css -> ../../stand-alone/plugin/css/dispatch.css
> plugin.css -> ../../stand-alone/plugin/css/plugin.css
> site-base.css -> ../../stand-alone/plugin/css/site-base.css
> Dir console/test/html/:
> qdrConnect.html -> ../../stand-alone/plugin/html/qdrConnect.html
> qdrLayout.html -> ../../stand-alone/plugin/html/qdrLayout.html
> Dir console/test/js/:
> qdrService.js -> ../../stand-alone/plugin/js/qdrService.js
> Dir console/test/lib/:
> rhea-min.js -> ../../stand-alone/plugin/lib/rhea-min.js
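
A quick way to list links like these is to walk the tree and report every symlink whose target does not resolve. The sketch below uses C++17 {{std::filesystem}}; {{demo::find_dangling_symlinks}} is a hypothetical helper and is not part of the Dispatch source tree.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>
#include <vector>

namespace demo {

namespace fs = std::filesystem;

// Returns every symlink under `root` whose target does not exist.
inline std::vector<fs::path> find_dangling_symlinks(const fs::path& root) {
    assert(fs::is_directory(root));
    std::vector<fs::path> dangling;
    for (const fs::directory_entry& entry :
         fs::recursive_directory_iterator(root)) {
        std::error_code ec;
        // symlink_status() inspects the link itself; exists() follows it,
        // so a link that is present but does not resolve is dangling.
        if (fs::is_symlink(entry.symlink_status()) &&
            !fs::exists(entry.path(), ec)) {
            dangling.push_back(entry.path());
        }
    }
    return dangling;
}

}  // namespace demo
```

Calling {{demo::find_dangling_symlinks("console/test")}} from the top of a checkout would be the obvious use, assuming that directory exists in the tree being inspected.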



[jira] [Resolved] (DISPATCH-989) symlinks in tree to non existent files, possibly stale and could be removed?

2018-05-08 Thread Ernest Allen (JIRA)

 [ 
https://issues.apache.org/jira/browse/DISPATCH-989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ernest Allen resolved DISPATCH-989.
---
Resolution: Fixed

This tool should be deprecated and then removed from the source tree in a later 
release.

> symlinks in tree to non existent files, possibly stale and could be removed?
> 
>
> Key: DISPATCH-989
> URL: https://issues.apache.org/jira/browse/DISPATCH-989
> Project: Qpid Dispatch
> Issue Type: Task
> Components: Console
> Affects Versions: 1.1.0
> Reporter: Robbie Gemmell
> Assignee: Ernest Allen
> Priority: Minor
> Fix For: 1.2.0
>
>
> There are a number of symlinks in the console tree which point to files that 
> no longer exist. It's not clear these are actually required anymore; they may 
> be stale and could be removed? The targets were seemingly removed by 
> DISPATCH-917
> Dir console/test/css/:
> brokers.ttf -> ../../stand-alone/plugin/css/brokers.ttf
> dispatch.css -> ../../stand-alone/plugin/css/dispatch.css
> plugin.css -> ../../stand-alone/plugin/css/plugin.css
> site-base.css -> ../../stand-alone/plugin/css/site-base.css
> Dir console/test/html/:
> qdrConnect.html -> ../../stand-alone/plugin/html/qdrConnect.html
> qdrLayout.html -> ../../stand-alone/plugin/html/qdrLayout.html
> Dir console/test/js/:
> qdrService.js -> ../../stand-alone/plugin/js/qdrService.js
> Dir console/test/lib/:
> rhea-min.js -> ../../stand-alone/plugin/lib/rhea-min.js



[jira] [Commented] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-05-08 Thread Cliff Jansen (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467772#comment-16467772
 ] 

Cliff Jansen commented on PROTON-1842:
--

Thank you for this additional info.

Yes, these look like the same types of stack traces after a dust up.

If they truly are, for me the pconnection_done thread (T2) got the R socket 
event, the last event batch, begin_close and free.

The competing thread (T4) got the RW socket event, then failed in numerous ways 
depending on what was in the torn down or reused freed memory.

Catastrophic stuff happens for about 1 in 1 connections for a debug build 
on a middle-aged 4c/8t desktop machine.

> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.22.0
>Reporter: Chuck Rolke
>Priority: Major
> Attachments: helloworld.cpp, race.tsan, race.vg
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
> {{  void on_connection_open(proton::connection& c) {}}
> {{  c.close();}}
> {{  }}}
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2030}}
> {{#6  0x7f795dbe89ad in thread_run (arg=0x26be750) at 
> /home/chug/git/qpid-dispatch/src/server.c:946}}
> {{#7  0x7f795d50e50b in start_thread (arg=0x7f794f486700) at 
> pthread_create.c:465}}
> {{#8  0x7f795c7d216f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95}}
>  






[jira] [Commented] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-05-08 Thread Alan Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467806#comment-16467806
 ] 

Alan Conway commented on PROTON-1842:
-

Note that the only way connections get closed in this version of the 
threaderciser is when a listener was closed and the connection failed 
(nobody listening, bad port, socket closed unexpectedly), so at least in the 
threaderciser version this is happening as a result of an early error while the 
connection is possibly not completely set up. I'm adding socket kills to the 
connection socket now and will let you know how that goes.




[jira] [Created] (DISPATCH-991) Master qdstat throws keyError when running against 1.0.1 router

2018-05-08 Thread Ganesh Murthy (JIRA)
Ganesh Murthy created DISPATCH-991:
--

 Summary: Master qdstat throws keyError when running against 1.0.1 
router
 Key: DISPATCH-991
 URL: https://issues.apache.org/jira/browse/DISPATCH-991
 Project: Qpid Dispatch
  Issue Type: Bug
  Components: Management Agent
Affects Versions: 1.1.0
Reporter: Ganesh Murthy
Assignee: Ganesh Murthy
 Fix For: 1.1.0


When running the master qdstat against a previously released 1.0.1 version of 
the router, the following error is raised:

KeyError: 'presettledDeliveries'
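
For illustration only, a KeyError of this kind typically comes from indexing a response dictionary for an attribute that an older router does not report. The sketch below shows a tolerant lookup with `dict.get`; it is not the actual qdstat change, and the helper name `stat_value`, the placeholder default and the sample dictionaries are assumptions.

```python
def stat_value(entity, name, default="-"):
    """Look up a statistic that an older router may not report."""
    # dict.get returns the default instead of raising KeyError
    # when the attribute is absent from the response.
    return entity.get(name, default)


# Hypothetical query results: an older router omits the newer attribute.
old_router = {"name": "router.A"}
new_router = {"name": "router.B", "presettledDeliveries": 3}
```

With these samples, `stat_value(old_router, "presettledDeliveries")` yields the placeholder rather than failing, while the newer router's value is passed through unchanged.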






[jira] [Commented] (DISPATCH-991) Master qdstat throws keyError when running against 1.0.1 router

2018-05-08 Thread Ganesh Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/DISPATCH-991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467982#comment-16467982
 ] 

Ganesh Murthy commented on DISPATCH-991:


Also put back the area field that was accidentally removed from the new version.

> Master qdstat throws keyError when running against 1.0.1 router
> ---
>
> Key: DISPATCH-991
> URL: https://issues.apache.org/jira/browse/DISPATCH-991
> Project: Qpid Dispatch
>  Issue Type: Bug
>  Components: Management Agent
>Affects Versions: 1.1.0
>Reporter: Ganesh Murthy
>Assignee: Ganesh Murthy
>Priority: Major
> Fix For: 1.1.0
>
>
> When running the master qdstat against a previously released 1.0.1 version of 
> the router, the following error is raised:
> KeyError: 'presettledDeliveries'






[jira] [Commented] (DISPATCH-991) Master qdstat throws keyError when running against 1.0.1 router

2018-05-08 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/DISPATCH-991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467997#comment-16467997
 ] 

ASF subversion and git services commented on DISPATCH-991:
--

Commit 448605e2a7d4cd724ab5d0659e060b11f4841994 in qpid-dispatch's branch 
refs/heads/master from [~ganeshmurthy]
[ https://git-wip-us.apache.org/repos/asf?p=qpid-dispatch.git;h=448605e ]

DISPATCH-991 - Added back area attribute and fixed the keyError. Now qdstat 
will be backward compatible

(cherry picked from commit 90b04701f01ea53fb00efc8b5d44c321bb78dc79)





[jira] [Created] (DISPATCH-992) System test is failing in some scenarios - system_tests_delivery_abort.py

2018-05-08 Thread Fernando Giorgetti (JIRA)
Fernando Giorgetti created DISPATCH-992:
---

 Summary: System test is failing in some scenarios - 
system_tests_delivery_abort.py
 Key: DISPATCH-992
 URL: https://issues.apache.org/jira/browse/DISPATCH-992
 Project: Qpid Dispatch
  Issue Type: Bug
  Components: Tests
Reporter: Fernando Giorgetti


On some machines, the system_tests_delivery_abort.py test fails (only the 
truncate tests) because the on_aborted() method is never invoked.

After debugging the test alongside the router code, this turned out to be a 
timing issue on some machines. When the sender's close() method is called 
(as at line 218), the headers have not yet been sent from the router 
(with aborted=true), so on_aborted is never invoked in the test.

Streaming a larger amount of data, like 100 instead of 10 (or even sleeping 
for 1 second before closing the sender), gives enough time for the headers 
to be sent, and the test then passes.
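
As a general pattern for timing-sensitive tests, waiting for an observable condition with a deadline tends to be more robust than a fixed sleep. The sketch below is a generic helper and not what the fix for this issue does; `wait_until` and its parameters are assumptions, and in an event-driven test the equivalent check would more likely live in a callback than in a blocking loop.

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.05):
    """Poll predicate until it returns true or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    # One final check so a condition met right at the deadline still counts.
    return bool(predicate())
```

A test could then close the sender only after something like `wait_until(lambda: headers_sent)` succeeds, where `headers_sent` is a hypothetical flag set by the test's own event handler.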






[jira] [Resolved] (DISPATCH-991) Master qdstat throws keyError when running against 1.0.1 router

2018-05-08 Thread Ganesh Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/DISPATCH-991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ganesh Murthy resolved DISPATCH-991.

Resolution: Fixed




[GitHub] qpid-dispatch pull request #302: DISPATCH-992: Fix for system_tests_delivery...

2018-05-08 Thread fgiorgetti
GitHub user fgiorgetti opened a pull request:

https://github.com/apache/qpid-dispatch/pull/302

DISPATCH-992: Fix for system_tests_delivery_abort.py



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fgiorgetti/qpid-dispatch 
fgiorgetti-DISPATCH-992

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/qpid-dispatch/pull/302.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #302


commit 95d1ba5e1d8882dd624b451072837396f201baa5
Author: Fernando Giorgetti 
Date:   2018-05-08T21:43:21Z

DISPATCH-992: Fix for system_tests_delivery_abort.py







[jira] [Commented] (DISPATCH-992) System test is failing in some scenarios - system_tests_delivery_abort.py

2018-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DISPATCH-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468000#comment-16468000
 ] 

ASF GitHub Bot commented on DISPATCH-992:
-

GitHub user fgiorgetti opened a pull request:

https://github.com/apache/qpid-dispatch/pull/302

DISPATCH-992: Fix for system_tests_delivery_abort.py



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fgiorgetti/qpid-dispatch 
fgiorgetti-DISPATCH-992

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/qpid-dispatch/pull/302.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #302


commit 95d1ba5e1d8882dd624b451072837396f201baa5
Author: Fernando Giorgetti 
Date:   2018-05-08T21:43:21Z

DISPATCH-992: Fix for system_tests_delivery_abort.py




> System test is failing in some scenarios - system_tests_delivery_abort.py
> -
>
> Key: DISPATCH-992
> URL: https://issues.apache.org/jira/browse/DISPATCH-992
> Project: Qpid Dispatch
>  Issue Type: Bug
>  Components: Tests
>Reporter: Fernando Giorgetti
>Priority: Major
>
> On some machines, the system_tests_delivery_abort.py test fails (only the 
> truncate tests) because the on_aborted() method is never invoked.
> After debugging the test alongside the router code, this turned out to be a 
> timing issue on some machines. When the sender's close() method is called 
> (as at line 218), the headers have not yet been sent from the router 
> (with aborted=true), so on_aborted is never invoked in the test.
> Streaming a larger amount of data, like 100 instead of 10 (or even 
> sleeping for 1 second before closing the sender), gives enough time for 
> the headers to be sent, and the test then passes.






[GitHub] qpid-dispatch pull request #302: DISPATCH-992: Fix for system_tests_delivery...

2018-05-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/qpid-dispatch/pull/302





[jira] [Commented] (DISPATCH-992) System test is failing in some scenarios - system_tests_delivery_abort.py

2018-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DISPATCH-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468138#comment-16468138
 ] 

ASF GitHub Bot commented on DISPATCH-992:
-

Github user asfgit closed the pull request at:

https://github.com/apache/qpid-dispatch/pull/302





[jira] [Commented] (DISPATCH-992) System test is failing in some scenarios - system_tests_delivery_abort.py

2018-05-08 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/DISPATCH-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468137#comment-16468137
 ] 

ASF subversion and git services commented on DISPATCH-992:
--

Commit 95d1ba5e1d8882dd624b451072837396f201baa5 in qpid-dispatch's branch 
refs/heads/master from [~fgiorget]
[ https://git-wip-us.apache.org/repos/asf?p=qpid-dispatch.git;h=95d1ba5 ]

DISPATCH-992: Fix for system_tests_delivery_abort.py





[jira] [Resolved] (DISPATCH-992) System test is failing in some scenarios - system_tests_delivery_abort.py

2018-05-08 Thread Ganesh Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/DISPATCH-992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ganesh Murthy resolved DISPATCH-992.

   Resolution: Fixed
Fix Version/s: 1.2.0

> System test is failing in some scenarios - system_tests_delivery_abort.py
> -
>
> Key: DISPATCH-992
> URL: https://issues.apache.org/jira/browse/DISPATCH-992
> Project: Qpid Dispatch
>  Issue Type: Bug
>  Components: Tests
>Reporter: Fernando Giorgetti
>Priority: Major
> Fix For: 1.2.0
>
>
> On some machines, the system_tests_delivery_abort.py test fails (only the 
> truncate tests) because the on_aborted() method is never invoked.
> After debugging the test alongside the router code, this turned out to be a 
> timing issue on some machines. When the sender's close() method is called 
> (as at line 218), the headers have not yet been sent from the router 
> (with aborted=true), so on_aborted is never invoked in the test.
> Streaming a larger amount of data, like 100 instead of 10 (or even 
> sleeping for 1 second before closing the sender), gives enough time for 
> the headers to be sent, and the test then passes.






[jira] [Commented] (PROTON-1771) [c-proactor] multi-thread race test for proactor

2018-05-08 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468169#comment-16468169
 ] 

ASF subversion and git services commented on PROTON-1771:
-

Commit 94dfe1bf033f7d4b9183bbad75b1801d688a300d in qpid-proton's branch 
refs/heads/master from [~aconway]
[ https://git-wip-us.apache.org/repos/asf?p=qpid-proton.git;h=94dfe1b ]

PROTON-1771: [c] add -close-connnect, -cancel-timeout to threaderciser

Also added -no-xxx flags to disable selected actions


> [c-proactor] multi-thread race test for proactor
> 
>
> Key: PROTON-1771
> URL: https://issues.apache.org/jira/browse/PROTON-1771
> Project: Qpid Proton
>  Issue Type: Test
>  Components: proton-c
>Affects Versions: proton-c-0.20.0
>Reporter: Alan Conway
>Assignee: Alan Conway
>Priority: Major
> Fix For: proton-c-0.23.0
>
>
> Create a new test exe that runs for a (configurable, default short) period of
> time, with a single proactor acted on by multiple proactor and user threads. 
> Run it with helgrind or tsan to detect races.
> Exercise potentially racy APIs concurrently:
> - making, accepting and closing (from both ends) a connection.
> - pn_connection_wake
> - pn_proactor_release_connection
> - re-use of released pn_connection_t on a new connection
> - timeout
> - concurrent with some normal use: sending/receiving messages.






[jira] [Commented] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-05-08 Thread Alan Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468176#comment-16468176
 ] 

Alan Conway commented on PROTON-1842:
-

Another note: the latest threaderciser shows the race with flags "-listen 
-connect -close-listen", so the only things racing here are IO events 
from connection errors and proactor-generated wakes; there are no user wakes 
involved.




[jira] [Comment Edited] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-05-08 Thread Alan Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468176#comment-16468176
 ] 

Alan Conway edited comment on PROTON-1842 at 5/9/18 12:55 AM:
--

Another note: the latest threaderciser shows the race with flags "-listen 
-connect -close-listen", so the only things racing here are IO events 
from connection errors and proactor-generated wakes; there are no user wakes 
involved.

I am seeing a race between pn_proactor_done() (user thread) deciding to finalize 
a connection and an epoll thread waking up to process it. The epoll thread is 
racing to lock the context mutex while the user thread is deleting it. I'm not 
seeing a crash, but it's clear that it could crash with the right timing.


was (Author: aconway):
Another note: the latest threaderciser shows the race with flags "-listen 
-connect -close-listen", so the only things racing here are IO events 
from connection errors and proactor-generated wakes; there are no user wakes 
involved.




[jira] [Comment Edited] (PROTON-1842) [c] Dispatch/Proton crashes when opening/closing connections

2018-05-08 Thread Alan Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468176#comment-16468176
 ] 

Alan Conway edited comment on PROTON-1842 at 5/9/18 1:07 AM:
-

Another note: the latest threaderciser shows the race with flags "-listen 
-connect -close-listen", so the only things racing here are IO events 
from connection errors and proactor-generated wakes; there are no user wakes 
involved.

I am seeing a race between pn_proactor_done() (user thread) deciding to finalize 
a connection and an epoll thread waking up to process it. The epoll thread is 
racing to lock the context mutex while the user thread is deleting it. I'm not 
seeing a crash, but it's clear that it could crash with the right timing.

Speculating: we need to bring back something like the ee->mutex to sync around 
epoll mods and waits. The variables in 

pconnection_is_final(pconnection_t *pc) {
  return !pc->current_arm && !pc->timer_armed && !pc->context.wake_ops;
} 

need to be synchronized around epoll events, because right now it seems that 
is_final can return true concurrently with epoll_wait returning the same pc, so 
it seems that current_arm is not properly synced.
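
To make the speculation concrete, here is a toy model (in Python rather than C, and not the Proton implementation) in which the three fields read by the finality check are only read or written while holding a single lock, so "decide to free" and "arm for another event" cannot interleave. The class `PConnectionModel`, its `lock` and the `freed` flag are assumptions for illustration; only the three field names come from the snippet above.

```python
import threading


class PConnectionModel:
    """Toy stand-in for pconnection_t; a sketch, not Proton's code."""

    def __init__(self):
        self.lock = threading.Lock()  # hypothetical mutex guarding the fields
        self.current_arm = 0
        self.timer_armed = False
        self.wake_ops = 0
        self.freed = False

    def _is_final(self):
        # Caller must hold self.lock.
        return not self.current_arm and not self.timer_armed and not self.wake_ops

    def try_free(self):
        """User-thread side: free only if nothing is pending, under the lock."""
        with self.lock:
            if self._is_final():
                self.freed = True
            return self.freed

    def arm(self):
        """Poller side: record a pending event, refused once freed."""
        with self.lock:
            if self.freed:
                return False
            self.current_arm = 1
            return True

    def disarm(self):
        """Poller side: the pending event has been fully processed."""
        with self.lock:
            self.current_arm = 0
```

In this model a connection that is armed can never be freed, and a freed connection can never be re-armed. Whether an equivalent invariant can be enforced cheaply around epoll_wait in the real proactor is the open question in this comment.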


was (Author: aconway):
Another note: the latest threaderciser shows the race with flags "-listen 
-connect -close-listen", so the only things racing here are IO events 
from connection errors and proactor-generated wakes; there are no user wakes 
involved.

I am seeing a race between pn_proactor_done() (user thread) deciding to finalize 
a connection and an epoll thread waking up to process it. The epoll thread is 
racing to lock the context mutex while the user thread is deleting it. I'm not 
seeing a crash, but it's clear that it could crash with the right timing.

> [c] Dispatch/Proton crashes when opening/closing connections
> 
>
> Key: PROTON-1842
> URL: https://issues.apache.org/jira/browse/PROTON-1842
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.22.0
>Reporter: Chuck Rolke
>Priority: Major
> Attachments: helloworld.cpp, race.tsan, race.vg
>
>
> Using proton cpp example code that is modified to open and close connections 
> by the thousands in the main loop and having the event loop short circuit any 
> messaging with:
>   void on_connection_open(proton::connection& c) {
>     c.close();
>   }
> and then directing this client example to a dispatch router 1.1.0. Eventually 
> (after 100,000 to 1,000,000 connection open/closes) the router crashes with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:466: 
> wake_pop_front: Assertion `p->wakes_in_progress' failed.}}
> and with:
> {{qdrouterd: /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014: 
> proactor_do_epoll: Assertion `ee->type == PCONNECTION_TIMER' failed.}}
> This issue seems to happen only with qpid-dispatch accepting the open/close 
> event stream. Proton cpp example _server_direct_ and c example _direct_ work 
> properly with the same open/close event stream mounting into the 10s of 
> millions of connections.
> A core dump backtrace with the PCONNECTION_TIMER failure reads as:
> {{(gdb) bt}}
> {{#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51}}
> {{#1  0x7f795c712c41 in __GI_abort () at abort.c:79}}
> {{#2  0x7f795c709f7a in __assert_fail_base (fmt=0x7f795c85a260 
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
> assertion=assertion@entry=0x7f795d72e15a "ee->type == PCONNECTION_TIMER", }}
> {{    file=file@entry=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=line@entry=2014, }}
> {{    function=function@entry=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> 
> "proactor_do_epoll") at assert.c:92}}
> {{#3  0x7f795c709ff2 in __GI___assert_fail (assertion=0x7f795d72e15a 
> "ee->type == PCONNECTION_TIMER", file=0x7f795d72de98 
> "/home/chug/git/qpid-proton/c/src/proactor/epoll.c", line=2014, }}
> {{    function=0x7f795d72e320 <__PRETTY_FUNCTION__.6307> "proactor_do_epoll") 
> at assert.c:101}}
> {{#4  0x7f795d72d29f in proactor_do_epoll (p=0x26b7310, can_block=true) 
> at /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2014}}
> {{#5  0x7f795d72d30e in pn_proactor_wait (p=0x26b7310) at 
> /home/chug/git/qpid-proton/c/src/proactor/epoll.c:2030}}
> {{#6  0x7f795dbe89ad in thread_run (arg=0x26be750) at 
> /home/chug/git/qpid-dispatch/src/server.c:946}}
> {{#7  0x7f795d50e50b in start_thread (arg=0x7f794f486700) at 
> pthread_create.c:465}}
> {{#8  0x7f795c7d216f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95}}
>  


